METHODS AND COMPOSITIONS FOR GENOTYPING

This application claims the benefit of the filing date of U.S. Provisional 
Application No. 60/400,533, filed August 2, 2002 and entitled "Compositions and 
Methods for Identifying Multiple Alleles at the Scrapie Locus," and U.S. Patent 
Application No. 10/328,150, filed December 20, 2002 and entitled "Methods and 
Compositions for Conducting Primer Extension and Polymorphism Detection
Reactions," the entire disclosures of which are hereby incorporated by reference into 
the present disclosure. 

BACKGROUND OF THE INVENTION 

Extensive progress in the field of biotechnology over the last two decades has
given rise to new and promising routes to the identification and investigation of 
genomic characteristics in all species. Specifically, advances in nucleic acid synthesis 
and sequencing have led to the development of the science of genomics. High-throughput
sequencing technologies have enabled significant milestones such as the
mapping of various genomes, including the human genome. With the ability to
rapidly sequence large amounts of DNA, large-scale analysis of genomic 
characteristics has become possible. Technologies are now evolving to identify and 
characterize features of genomes pertinent to individual or population-based 
variations in genotypes that may be used for applications such as identifying an
individual's susceptibility to a given disease, identifying characteristics of interest in a
gene or a genome, and identifying genetic characteristics that cause or promote 
disease states. Among the most promising of avenues for characterizing genomic 
variance in individuals and populations is the analysis and characterization of genetic 
polymorphisms. 

Polymorphisms relate to variances in genomes among different species, for 
example, or among members of a species, among populations or sub-populations 
within a species, or among individuals in a species. Such variances are expressed as 
differences in nucleotide sequences at particular loci in the genomes in question.
These differences include, for example, deletions, additions or insertions,
rearrangements, or substitutions of nucleotides or groups of nucleotides in a genome. 

One important type of polymorphism is a single nucleotide polymorphism 
(SNP). Single nucleotide polymorphisms occur with a frequency of about 1 in 300 to
about 1 in 1,000 base pairs, where a single nucleotide base in the DNA sequence
varies among individuals. SNPs may occur both inside and outside the coding regions 
of genes. It is believed that many diseases, including many cancers, hypertension, 
heart disease, and diabetes, for example, are the result of mutations borne as SNPs or 
collections of SNPs in subsets of the human population. Currently, one focus of
genomics is the identification and characterization of SNPs and groups of SNPs and 
how they relate to phenotypic characteristics of medical and/or pharmacogenetic 
relevance, for example. 

A variety of approaches to determining, or scoring, the large variety of
polymorphisms in genomes have been developed. Although these methods are applicable
to many types of genomic polymorphisms, they are particularly amenable to 
determining, or scoring, SNPs. 

One preferred method of polymorphism detection employs enzyme-assisted
primer extension. SNP-IT™ (disclosed by Goelet, P. et al. WO 92/15712, and U.S.
Patent Nos. 5,888,819 and 6,004,744, each herein incorporated by reference in its 
entirety) is a preferred method for determining the identity of a nucleotide at a 
predetermined polymorphic site in a target nucleic acid sequence. Thus, this method
is uniquely suited for SNP scoring, although it also has general applicability for
determination of a wide variety of polymorphisms. SNP-IT is a method of 
polymorphic site interrogation in which the nucleotide sequence information 
surrounding a polymorphic site in a target nucleic acid sequence is used to design a 
primer that is complementary to a region immediately adjacent to, but not
including, the variable nucleotide(s) in the polymorphic site of
the target polynucleotide. The primer is extended by a single labeled terminator
nucleotide, such as a dideoxynucleotide, using a polymerase, often in the presence of 
one or more chain terminating nucleoside triphosphate precursors (or suitable 
analogs). A detectable signal or moiety, covalently attached to the SNP-IT™ primer,
is thereby produced. 

In some embodiments of SNP-IT™, the oligonucleotide primer is bound to a
solid support prior to the extension reaction. In other embodiments, the extension 
reaction is performed in solution and the extended product is subsequently bound to a
solid support. In an alternate embodiment of SNP-IT™, the primer is detectably
labeled and the extended terminator nucleotide is modified so as to enable the 
extended primer product to be bound to a solid support. 

Ligase/polymerase mediated genetic bit analysis (U.S. Patent Nos. 5,679,524, 
and 5,952,174, both herein incorporated by reference) is another example of a suitable
polymerase-mediated primer extension method for determining the identity of a
nucleotide at a polymorphic site. Ligase/polymerase SNP-IT™ utilizes two primers.
Generally, one primer is detectably labeled, while the other is designed to be bound to
a solid support. In alternate embodiments of ligase/polymerase SNP-IT™, the
extended nucleotide is detectably labeled. The primers in ligase/polymerase SNP-IT™
are designed to hybridize to each side of a polymorphic site on the same strand, such 
that there is a gap comprising the polymorphic site. Only a successful extension 
reaction, followed by a successful ligation reaction, results in production of a 
detectable signal. This method offers the advantage of producing a signal with
considerably lower background than is possible by methods employing
hybridization or primer extension alone.

An alternate method for determining the identity of a nucleotide at a 
predetermined polymorphic site in a target polynucleotide is described in Soderlund et 
al., U.S. Patent No. 6,013,431 (the entire disclosure of which is herein incorporated by
reference). In this alternate method, nucleotide sequence information surrounding a
polymorphic site in a target nucleic acid sequence is used to design a primer that is 
complementary to a region flanking, but not including, the variable nucleotide(s) at 
the polymorphic site of the target. In some embodiments of this method, following 
isolation, the target polynucleotide may be amplified by any suitable means prior to
hybridization to the interrogating primer. The primer is extended, using a
polymerase, often in the presence of a mixture of at least one labeled deoxynucleotide
and one or more chain terminating nucleoside triphosphate precursors (or suitable 
analogs). A detectable signal is produced upon incorporation of the labeled
deoxynucleotide into the primer. 

Due to the large size of many studies that use SNP information, SNP detection 
must be rapid, reliable, and amenable to high-throughput operation. Reliably interpreting the
results of an assay for polymorphism detection or identification using SNP-based
applications is an important consideration, particularly when employing multiplex and 
high-throughput protocols. Accurate quantitation of primer extension products is one 
method of interpreting results. 

Thus, there is a need in the art of polymorphism detection and identification for
a system that provides for the confirmation of amplification, and that provides for
accurate detection and identification of polymorphisms, and that can provide for 
abundance analysis of reaction products, either separately or simultaneously. There is 
also a need for an assay wherein control reactions that mirror the diagnostic assay are
conducted under similar conditions, reducing the effect of factors influencing the
efficiency of incorporation of one nucleotide over another on the interpretation of 
assay results, particularly in multiplex applications. 
SUMMARY OF THE INVENTION



In one embodiment, the invention comprises a method of performing a primer 
extension reaction, comprising: obtaining an amplicon having a sequence generated 
from a target nucleic acid and a sequence generated from a first strand amplification
primer, by amplifying a target nucleic acid having a variant nucleotide flanked by an 
invariant nucleotide, wherein a first strand amplification primer is employed that 
comprises a 5' tag substantially incapable of hybridizing to the target nucleic acid 
under amplification conditions, and wherein the 5' tag contains the variant nucleotide
of the target nucleic acid, and employing a second strand amplification primer;
employing the amplicon in a primer extension reaction wherein the identity of the 
variant nucleotide in the sequence generated from the target nucleic acid is 
determined by hybridizing a first identification primer immediately adjacent to the 
variant nucleotide in the sequence generated from the target nucleic acid; hybridizing
a second identification primer immediately adjacent to the variant nucleotide in the
sequence generated from the amplification primers; extending the first and the second 
identification primers in the presence of one or more nucleotides and a polymerizing 
agent; determining the identity of the variant nucleotide generated from the target 
nucleic acid; and comparing extension product of the first identification primer and
extension product of the second identification primer, thereby monitoring the primer
extension reaction. 

In another embodiment, the invention comprises a method of performing a 
primer extension reaction, comprising: obtaining a sample comprising target nucleic
acid from one or more individuals; obtaining an amplicon population having a
sequence generated from the sample and a sequence generated from a tagged first
strand amplification primer, by amplifying nucleic acids in the sample having a 
variant nucleotide that is a transversion flanked in the 5' direction by an invariant 
nucleotide and flanked in the 3' direction by an invariant nucleotide, wherein the
tagged first strand primer is employed that comprises a 5' tag substantially incapable
of hybridizing to target nucleic acids in the sample, and wherein the 5' tag contains
the variant nucleotide with its flanking invariant nucleotides, and wherein a second 
strand amplification primer is employed; employing the amplicon population in a 
primer extension reaction wherein the identity of the variant nucleotide in the
sequence generated from the sample is determined by hybridizing a first identification 
primer immediately adjacent to the variant nucleotide in the sequence generated from 
the sample; hybridizing a second identification primer immediately adjacent to the 
variant nucleotide in the sequence generated from the amplification primer; extending
the first and the second identification primers in the presence of one or more 
nucleotides and a polymerizing agent; determining the identity of the variant 
nucleotide generated from the sample; and comparing extension product of the first 
identification primer and extension product of the second identification primer, 
thereby performing the primer extension reaction.

In another embodiment, the invention comprises a method of performing 
primer extension utilizing at least two amplification primers comprising: obtaining a 
target nucleic acid comprising a variant nucleotide flanked by an invariant nucleotide;
hybridizing to the target nucleic acid a first amplification primer having a 5' tag
comprising the variant nucleotide flanked by the invariant nucleotide, wherein the 5'
tag is substantially unable to hybridize to the target nucleic acid, and a second 
amplification primer; and extending the amplification primers in the presence of
one or more nucleotides and a polymerizing agent, thereby performing primer
extension.

In another embodiment, the invention comprises a composition, comprising: a 
primer having a region capable of hybridizing to a target nucleic acid wherein the 
target nucleic acid comprises a variant nucleotide and an invariant nucleotide, and 
wherein the primer further comprises a 5' tag region having the variant nucleotide and
the invariant nucleotide of the target nucleic acid, and wherein the 5' tag region is 
substantially incapable of hybridizing to the target nucleic acid under conditions 
suitable for amplification of the target nucleic acid. 

In another embodiment, the invention comprises a method of monitoring the
efficiency of incorporation of chain terminators into primers in a primer extension 
reaction, comprising: generating a population of amplicons from a mixed sample of 
target nucleic acid, wherein the population of amplicons comprises sequences at 
known ratios; performing primer extension reactions on the population of amplicons 
employing chain terminators and employing a population of primers specific for the
sequences; detecting and measuring efficiency of incorporation of chain terminators 
into the population of primers at the known ratios, thereby monitoring the efficiency 
of incorporation of chain terminators into primers in a primer extension reaction. 
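
By way of illustration only, the comparison of terminator incorporation at known ratios may be sketched as follows. The function name, the peak values, and the assumption that detected signal is proportional to template abundance (that is, sub-saturating terminator conditions) are hypothetical and are not taken from the present disclosure; the G:C ratios of 3:1, 1:1 and 1:3 follow the assay described for Figure 10.

```python
def relative_incorporation(known_g_to_c: float, signal_g: float, signal_c: float) -> float:
    """Observed G:C signal ratio divided by the known G:C template ratio.

    A value of 1.0 would mean both terminators are incorporated (and detected)
    equally well; other values indicate a bias toward one terminator.
    """
    if known_g_to_c <= 0 or signal_c <= 0:
        raise ValueError("ratios and signals must be positive")
    return (signal_g / signal_c) / known_g_to_c


# Hypothetical peak heights (arbitrary units), one row per known template ratio:
# (known G:C ratio, G peak, C peak)
calibration = [
    (3.0, 2400.0, 1000.0),
    (1.0, 800.0, 1000.0),
    (1.0 / 3.0, 400.0, 1500.0),
]

factors = [relative_incorporation(*row) for row in calibration]
for (ratio, _, _), factor in zip(calibration, factors):
    print(f"known G:C = {ratio:.2f} -> relative incorporation = {factor:.2f}")
```

If the factors agree across the known ratios, a single correction factor might be applied to the diagnostic peaks; if they drift with the ratio, the relationship is not linear and a fitted function would be needed instead.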

In yet another embodiment, the invention comprises a method of performing a 
primer extension reaction, comprising: obtaining a sample comprising target nucleic 
acid from one or more individuals; obtaining an amplicon population having a 
sequence generated from the sample and a sequence generated from a tagged first
strand amplification primer, by amplifying nucleic acids in the sample having a
variant nucleotide, wherein the tagged first strand primer is employed that comprises
a 5' tag substantially incapable of hybridizing to target nucleic acids in the sample, 
and wherein the 5' tag contains the variant nucleotide, and wherein a second strand 
amplification primer is employed; employing the amplicon population in a primer
extension reaction wherein the identity of the variant nucleotide in the sequence
generated from the sample is determined by hybridizing a first identification primer
immediately adjacent to the variant nucleotide in the sequence generated from the 
sample; hybridizing a second identification primer immediately adjacent to the variant 
nucleotide in the sequence generated from the amplification primer; extending the
first and the second identification primers in the presence of one or more nucleotides
and a polymerizing agent; determining the identity of the variant nucleotide generated 
from the sample; and comparing extension product of the first identification primer 
and extension product of the second identification primer, thereby performing the 
primer extension reaction. 

The invention also comprises methods of breeding scrapie-resistant sheep. 

For a better understanding of the present invention together with other and 
further advantages and embodiments, reference is made to the following description 
taken in conjunction with the examples, the scope of which is set forth in the 
appended claims.
BRIEF DESCRIPTION OF THE FIGURES



Preferred embodiments of the invention have been chosen for purposes of 
illustration and description, but are not intended in any way to restrict the scope of the 
invention. The preferred embodiments of certain aspects of the invention are shown in
the accompanying figures, wherein: 

Figure 1 illustrates an imbalance of heterozygotes at the scrapie locus where there are
apparently only two haplotypes represented. 

Figure 2 illustrates an imbalance of heterozygotes and where there are unique peaks 
present representing three discrete haplotypes. 

Figure 3 illustrates two hybrid amplification primers, comprising a 3' end capable of 
hybridizing to some target DNA, and a 5' tag sequence substantially incapable of
hybridizing to this target DNA. 

Figure 4 illustrates a diagnostic amplification of a target nucleic acid sequence 
comprising a single nucleotide polymorphism, and the amplicon that would result. 

Figure 5 illustrates the products of a 50:50 diagnostic amplification (the products 
which would result from the amplification of Figure 4) and shows how the lower or 
upper strand may be employed in a diagnostic primer extension reaction. 

Figure 6 illustrates three possible outcomes where the target comprises a single source
of template DNA, employing an A/G single fluorescent base extension reaction and 
analysis by capillary electrophoresis for illustration only. 

Figure 7 illustrates how outcomes may appear in a multiplex assay of eight 
polymorphisms, with the mirror SNPs for each target SNP represented as the smaller
doublet peaks. 
Figure 8 illustrates a single tube test for a mixed sample for resolving polymorphisms
that are transversions flanked on either side by nucleotides that are complementary to 
one another. Solid box indicates a nucleotide complementary to hatched box. 

Figure 9 illustrates a quadruplex reaction wherein the amplicon generated in Figure 8 
is interrogated by four distinct extension primers (only one of the primers that bear an 
asterisk is used) and are used to generate data from three ratios of G:C and an 
unknown polymorphic residue S. 

Figure 10 illustrates how results from a quadruplex single tube G:C (3:1, 1:1 and 1:3)
assay would ideally appear given that the primer extension reaction is sub-saturating
with respect to the terminating nucleotides.

Figure 11 illustrates the likely shape of the graph of relative incorporation of two
terminating nucleotides in a primer extension reaction. 

Figure 12 illustrates the assay of a transversion polymorphism with up to six 
extension primers per amplicon. 

Figure 13 illustrates how results of a hexaplex one-tube SNP ratio matrix might
appear. 

Figure 14 illustrates the more likely shape of the graph of relative incorporation of 
two terminating nucleotides in a primer extension reaction. 

Figure 15 illustrates how results from a multiplex reaction assaying a plurality of 
polymorphisms would appear. 

Figure 16 illustrates a means of defining a mathematical function between the ratio of 
incorporation of one nucleotide over the other in a primer extension reaction for a 
given SNP in a specific sequence context. 

Figure 17 illustrates possible mathematical relationships between efficiency of 
incorporation of two chain terminating nucleotides. 



Figure 18 illustrates an application of the invention to the ovine PrP gene with two
separate mirror SNPs carried in each of the 5' tag sequences.

Figure 19 illustrates output from an ovine PrP analysis if each SNP were a 
heterozygote, which is not thought to occur in nature. 

Figure 20 illustrates how the system would amplify a target nucleotide in which the 
variant nucleotide arises as the result of a deletion. 

Figure 21 illustrates how the system would amplify a target nucleotide in which the
variant nucleotide arises as the result of an insertion. 

Figure 22 illustrates the design of a system in which a variant nucleotide may be 
introduced through the use of two 5' tag primers, attached to two initial amplification 
primers. 

Figure 23 illustrates an exogenous control system for a PCR reaction. 

Figure 24 illustrates a set of DNA sequences that can work efficiently as part of an 
exogenous control system for a PCR reaction. 

Figure 25 illustrates result patterns for a genotyping experiment in accordance with 
Figure 18. 

Figure 26 illustrates result patterns for a multiplex genotyping experiment in 
accordance with Figure 18. 

Figure 27 illustrates a targeted SWaP SNP result. 

Figure 28 illustrates a 310 bp amplicon produced during a scrapie assay. 

Figure 29 illustrates sequences of interest in the scrapie locus. 
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS



The present invention provides methods and compositions for conducting 
primer extension reactions, nucleic acid amplification reactions and polymorphism 
identification reactions. Further, the present invention provides methods and
compositions that monitor high throughput multiplex detection of polymorphisms.



Flanking 

The term flanking includes at least one or more unpaired nucleotide bases from
the site of interest. Preferably, the one or more unpaired nucleotide bases are 
immediately adjacent to the site of interest. Most preferably, flanking means 
immediately adjacent to the site of interest. Thus, a variant nucleotide flanked on the 
5' side by an invariant nucleotide describes a sequence wherein the invariant
nucleotide is the very next nucleotide in the sequence in the 5' direction of the variant
nucleotide. Similarly, a variant nucleotide flanked on the 3' side by an invariant 
nucleotide describes a sequence wherein the invariant nucleotide is the very next 
nucleotide in the sequence in the 3' direction of the variant nucleotide. 
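
The most preferred sense of flanking (immediately adjacent) can be illustrated with a minimal sketch. The function name and the example sequence are hypothetical; the sequence is written 5' to 3', and the letter S marks the variant position, following the notation used for the unknown polymorphic residue in Figure 9.

```python
def flanking_bases(sequence: str, variant_index: int) -> tuple[str, str]:
    """Return the bases immediately 5' and 3' of the variant position.

    `sequence` is read 5' -> 3' and `variant_index` is 0-based.
    """
    if not 0 < variant_index < len(sequence) - 1:
        raise ValueError("the variant needs a neighbor on both sides")
    return sequence[variant_index - 1], sequence[variant_index + 1]


# Hypothetical locus: invariant T on the 5' side, invariant A on the 3' side.
five_prime, three_prime = flanking_bases("ACGTSACGT", 4)
print(five_prime, three_prime)  # T A
```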

Variant Nucleotide

Variant nucleotides are nucleotides that are known to vary within or
between individuals in a population at a given locus. Preferably, a population 
includes individuals of a given genus and species. The term variant nucleotide is 
meant to include a polymorphism in a nucleotide sequence. Polymorphic sites may
display a great deal of variance in the population, or may vary in only one percent or
less of the population. Polymorphisms may be either heterozygous or homozygous 
within an individual. Homozygous individuals have identical alleles at one or more 
corresponding loci on homologous chromosomes. Heterozygous individuals have 
different alleles at one or more corresponding loci on homologous chromosomes. As
used herein, alleles include an alternative form of a gene or nucleic acid sequence,
either inside or outside the coding region of a gene, including introns, exons, and 
untranscribed or untranslated regions. Alleles of a specific gene generally occupy the 
same location on homologous chromosomes. A polymorphism is thus said to be 
allelic, in that, due to the existence of the polymorphism, some members of a species 
carry a gene with one sequence (e.g., the original or wild-type allele), whereas other
members may have an altered sequence (e.g., the variant or mutant allele). In the
simplest case, only one mutated variant of the sequence may exist, and the 
polymorphism is said to be biallelic. For example, if the two alleles at a locus are 
indistinguishable (for example A/A), then the individual is said to be homozygous at
the locus under consideration. If the two alleles at a locus are distinguishable (for 
example A/G), then the individual is said to be heterozygous at the locus under 
consideration. The vast majority of known single nucleotide polymorphisms are biallelic,
where there are two alternative bases at the particular locus under
consideration. The term individual includes an individual of any species, including
but not limited to humans. Variant nucleotides may arise in a variety of ways, and the 
term variant nucleotide is meant to include nucleotides that vary by reason of, for 
example, mutations, insertions, deletions, frameshifts, etc. Most preferably, the 
variant nucleotide is a single nucleotide polymorphism. 
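
The definitions of homozygous, heterozygous and biallelic given above can be expressed as a short sketch. The function name and the genotype data are hypothetical and serve only to illustrate the terminology; each genotype is a pair of alleles observed in one individual at a single locus.

```python
def describe_locus(genotypes: list[tuple[str, str]]) -> tuple[list[str], bool, list[str]]:
    """Summarize one locus across a population.

    Returns the distinct alleles seen, whether the locus is biallelic
    (exactly two alleles in the population), and a per-individual call.
    """
    alleles = sorted({allele for pair in genotypes for allele in pair})
    calls = ["homozygous" if a == b else "heterozygous" for a, b in genotypes]
    return alleles, len(alleles) == 2, calls


# Hypothetical A/G SNP scored in three individuals.
alleles, is_biallelic, calls = describe_locus([("A", "A"), ("A", "G"), ("G", "G")])
print(alleles, is_biallelic, calls)
```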

Invariant Nucleotide 

Invariant nucleotides are nucleotides that do not vary among individuals of a 
given population at a given locus. Most preferably, the invariant nucleotide never 
varies between individuals of a population. Individuals of a population preferably are 
of the same genus and species, such as individual humans in a population of humans.



Tags 

By the term 5' tag is meant a nucleotide sequence that begins at the 5' terminus
of a primer, extends some distance in the 3' direction in the primer, and is
substantially incapable of hybridizing to the target nucleic acid. In the case of
amplification primers, a 5' tag must be substantially unable to hybridize to the target 
nucleic acid under conditions sufficient to support amplification of sequences of the 
target nucleic acid. Tags can be non-complementary bases, or longer sequences that 
can be interspersed into the primer provided that the primer sequence has sufficient
complementarity with the sequence of the target strand to hybridize therewith for the
purposes employed. Preferably, the 5' tags bear little or no complementarity to the 
target nucleic acid. Most preferably, the 5' tags bear no complementarity to the target 
nucleic acid. However, apart from the 5' tags, the primers in the most preferred 
embodiment should have exact complementarity to invariant regions of the target 
nucleic acid(s) to which they hybridize to obtain optimal results. Thus, primers
employed in the present invention may be substantially complementary in sequence 
and be able to form a double-stranded structure or hybrid with a target nucleotide 
sequence under the particular conditions employed. The exception to this general rule 
is the 5' tag region of amplification primers, which must be substantially unable to
hybridize to the target nucleic acid under amplification conditions, and the 5' tag 
region of identification primers, which should also be substantially unable to 
hybridize to the amplicon or population of amplicons so as not to interfere with 
extension of the identification primers. Where invariant sequences in a target nucleic
acid adjacent to a variant nucleotide are known, methods are available to those of
ordinary skill in the art for selecting sequences that are substantially unable to 
hybridize to those sequences such that 5' tags can be designed that do not interfere 
with either the amplification or identification reactions. Preferably, the 5' tags should
exhibit from less than about 1% to no more than about 30% complementarity to the target
nucleic acid. More preferably, the 5' tags should exhibit from less than about 1% to
no more than about 25% complementarity to the target nucleic acid. Most preferably, the 5'
tags should exhibit from less than about 1% to no more than about 5% complementarity to
the target nucleic acid. Where 5' tags are designed to contain no invariant or variant
nucleotides of the target nucleic acids, the 5' tags can exhibit no complementarity at
all to the target nucleic acid.
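
One way a candidate 5' tag might be screened against the upper limits of about 30%, 25% and 5% is sketched below. The disclosure does not prescribe how percent complementarity is computed; the ungapped, position-by-position scan used here, the function names and the example sequences are all assumptions made for illustration. The scan is deliberately strict, since it reports the best match found at any position of the target.

```python
_PAIR = str.maketrans("ACGT", "TGCA")


def max_tag_complementarity(tag: str, target: str) -> float:
    """Highest percent of tag bases able to pair with any ungapped target window.

    Both sequences are written 5' -> 3'. The tag is reverse-complemented to
    obtain the target sequence it would pair with perfectly, and that probe is
    slid along the target counting identical positions.
    """
    probe = tag.upper().translate(_PAIR)[::-1]
    target = target.upper()
    best = 0
    for start in range(len(target) - len(probe) + 1):
        window = target[start:start + len(probe)]
        best = max(best, sum(p == t for p, t in zip(probe, window)))
    return 100.0 * best / len(probe)


def tag_tier(percent: float) -> str:
    """Map a percent complementarity onto the preference tiers (assumed cut-offs)."""
    if percent <= 5.0:
        return "most preferred"
    if percent <= 25.0:
        return "more preferred"
    if percent <= 30.0:
        return "preferred"
    return "outside preferred range"


# Hypothetical sequences: a poly-A tag cannot pair anywhere on a poly-C target.
score = max_tag_complementarity("AAAA", "CCCCCCCC")
print(score, tag_tier(score))
```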

Complementarity 

A nucleic acid molecule is said to be the complement of another nucleic acid
molecule (or of itself) if it exhibits complete sequence complementarity. As used
herein, molecules are said to exhibit complete complementarity when every
nucleotide of one of the molecules is able to form a base pair with a nucleotide of the 
other. Substantially complementary refers to the ability of a nucleic acid molecule to 
hybridize to another nucleic acid molecule (or with itself) with sufficient stability to
permit annealing under at least conventional low-stringency conditions. Similarly, the
molecules are said to be complementary if they can hybridize to one another with
sufficient stability to permit them to remain annealed to one another under 
conventional high-stringency conditions. Conventional stringency conditions are 
described, for example, in Sambrook, J., et al., in Molecular Cloning, a Laboratory 
Manual, 2nd Edition, Cold Spring Harbor Press, Cold Spring Harbor, New York 
(1989) (herein incorporated by reference). An explanation of the effect of ionic
concentration and temperature on stringency can also be found in PCR Primer: A 
Laboratory Manual, Dieffenbach and Dveksler (Eds), Cold Spring Harbor Press, Cold 
Spring Harbor, New York (1995) (herein incorporated by reference). Departures 
from complete complementarity are therefore permissible, as long as such departures
do not preclude the capacity of the molecules to form a double-stranded structure or 
hybrid. Preferably, primers should exhibit, in the region not including the 5' tag, 80 
to 100% complementarity to the target nucleic acid region they are designed to anneal 
with. More preferably, primers should exhibit, in the region not including the 5' tag, 
90 to 100% complementarity to the target nucleic acid sequence they are designed to
anneal with. Most preferably, primers should exhibit, in the region not including the
5' tag, 100% complementarity to the target nucleic acid region they are designed to
anneal with. 
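
For the region of a primer not including the 5' tag, the 80 to 100%, 90 to 100% and 100% ranges can be checked against the intended annealing site with a simple position-by-position comparison. This is a minimal sketch under the assumption that the annealing site is known and equal in length to the primer region; the function names and sequences are hypothetical.

```python
_PAIR = str.maketrans("ACGT", "TGCA")


def percent_complementarity(primer_region: str, site: str) -> float:
    """Percent of positions at which the primer region pairs with its site.

    Both arguments are written 5' -> 3' and must be the same length.
    """
    if len(primer_region) != len(site):
        raise ValueError("primer region and site must be the same length")
    # Sequence (5' -> 3') that would pair perfectly with the primer region.
    perfect_site = primer_region.upper().translate(_PAIR)[::-1]
    matches = sum(a == b for a, b in zip(perfect_site, site.upper()))
    return 100.0 * matches / len(site)


def annealing_tier(percent: float) -> str:
    """Map percent complementarity onto the stated preference ranges."""
    if percent == 100.0:
        return "most preferred"
    if percent >= 90.0:
        return "more preferred"
    if percent >= 80.0:
        return "preferred"
    return "outside preferred range"


site = "GGCTAGGCTA"                   # hypothetical target site, 5' -> 3'
exact = "TAGCCTAGCC"                  # reverse complement of the site
one_mismatch = "AAGCCTAGCC"           # same primer with its 5' base changed
for primer in (exact, one_mismatch):
    pct = percent_complementarity(primer, site)
    print(primer, pct, annealing_tier(pct))
```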

Primer Extension 

Primer extension includes the extension of an oligonucleotide primer in a 
template-dependent manner, by one or more nucleotides. The one or more 
nucleotides can be one or more chain terminators, acyclo terminators, non-chain
terminating nucleotides and/or their analogs, and the like. Whichever nucleotide or
analog thereof is used, it need only be capable of being added to a primer in a
template-dependent fashion by a polymerizing agent. Preferably, when amplification 
primers are extended, the nucleotides are all four deoxynucleotides dATP, dGTP, 
dTTP, and dCTP. Preferably, one or more labeled chain-terminators are employed 
where identification primers are extended. A preferred method of amplification is 
amplification employing thermally stable polymerizing agents, such as the 
polymerase chain reaction. Amplification conditions for employing thermally stable 
polymerases are well known in the art. 

In a preferred embodiment, following amplification, the reaction mixture is
prepared prior to the use of identification primers. Many methods are
known in the art to achieve this end, such as, for example, treating the reaction 
mixture with one or more phosphatases that will inactivate any deoxynucleotides 
present in the reaction mixture; adding one or more nucleases to remove single 
stranded primers, then separating or inactivating the phosphatases and nucleases prior 
to an identification step, and other measures known to those skilled in the art.
Identification primers and a polymerizing agent are then added, preferably along with 
fluorescently labeled terminators, and primer extension is allowed to occur. Once the 
primer extension reaction has occurred, the products of the reaction are preferably 
analyzed using a capillary gel electrophoresis apparatus with a fluorescence detector.
Such an apparatus separates the primers based on mass:charge ratio, and the identity
of the detection primer can be ascertained by inspecting the distribution of the 
extended primers by fluorescence. 

One preferred method of detecting polymorphic sites employs enzyme-assisted
primer extension. SNP-IT™ (disclosed by Goelet, P. et al., and U.S. Patent
Nos. 5,888,819 and 6,004,744, each herein incorporated by reference in its entirety) is 
a preferred method for determining the identity of a nucleotide at a predetermined 
polymorphic site in a target nucleic acid sequence. Thus, it is uniquely suited for SNP
scoring, although it also has general applicability for determination of a wide variety
of polymorphisms. SNP-IT™ is a method of polymorphic site interrogation in which 
the nucleotide sequence information surrounding a polymorphic site in a target 
nucleic acid sequence is used to design an oligonucleotide primer that is 
complementary to a region immediately adjacent to, but not including, the variable 

20 nucleotide(s) in the polymorphic site of the target polynucleotide. The target 

polynucleotide is isolated from a biological sample and hybridized to the interrogating 
primer. Following isolation, the target polynucleotide may be amplified by any 
suitable means prior to hybridization to the interrogating primer. The primer is 
extended by a single labeled terminator nucleotide, such as a dideoxynucleotide, using 

25 a polymerase, often in the presence of one or more chain terminating nucleoside 

triphosphate precursors (or suitable analogs). A detectable signal is thereby produced. 
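
The single-base extension logic described above can be illustrated with a minimal sketch. The function names, the example sequence, and the site index are hypothetical and chosen only for illustration; they are not part of the disclosed method. The sketch assumes the template strand is written 5' to 3' and that the primer anneals to the bases immediately 3' of the polymorphic site on that strand.

```python
# Illustrative sketch only: names and sequences are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_interrogating_primer(template, site, length):
    """Primer (5'->3') complementary to the `length` template bases
    immediately 3' of the polymorphic site, excluding the site itself."""
    flank = template[site + 1 : site + 1 + length]
    if len(flank) < length:
        raise ValueError("not enough flanking sequence for the primer")
    return reverse_complement(flank)


def incorporated_terminator(template, site):
    """Name of the dideoxy terminator that pairs with the template base
    at the polymorphic site, i.e. the single base added to the primer."""
    return "dd" + COMPLEMENT[template[site]] + "TP"


template = "GATTACAGGCTTACGATCCA"  # invented sequence, variable base at index 7
primer = design_interrogating_primer(template, 7, 8)
terminator = incorporated_terminator(template, 7)
```

If each of the four terminators carries a distinct label, the label observed on the extended primer reports the base at the polymorphic site.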

Ligase/polymerase mediated genetic bit analysis (U.S. Patent Nos. 5,679,524,
and 5,952,174, both herein incorporated by reference) is another example of a suitable
polymerase mediated primer extension method for determining the identity of a
nucleotide at a polymorphic site. Ligase/polymerase SNP-IT™ utilizes two primers.
Generally, one primer is detectably labeled, while the other is designed to be affixed
to a solid support. In alternate embodiments of ligase/polymerase SNP-IT™, the
extended nucleotide is detectably labeled. The primers in ligase/polymerase SNP-IT™
are designed to hybridize to each side of a polymorphic site, such that there is a gap
comprising the polymorphic site. Only a successful extension reaction, followed by a
successful ligation reaction, enables production of the detectable signal. The method
offers the advantages of producing a signal with considerably lower background than
is possible by methods employing either hybridization or primer extension alone.

An alternate method for determining the identity of a nucleotide at a
polymorphic site in a target polynucleotide is described in Soderlund et al., U.S.
Patent No. 6,013,431 (the entire disclosure of which is herein incorporated by
reference). In this method, the nucleotide sequence surrounding a polymorphic site in
a target nucleic acid sequence is used to design an oligonucleotide primer that is
complementary to a region flanking the 3' end, with respect to the polymorphic site, of
the target polynucleotide, but not including the variable nucleotide(s) in the
polymorphic site of the target polynucleotide. The target polynucleotide is isolated
from the biological sample and hybridized with an interrogating primer. In some
embodiments of this method, following isolation, the target polynucleotide may be
amplified by any suitable means prior to hybridization with the interrogating primer.
The primer is extended, using a polymerase, often in the presence of a mixture of at
least one labeled deoxynucleotide and one or more chain terminating nucleoside
triphosphate precursors (or suitable analogs). A detectable signal is produced on the
primer upon incorporation of the labeled deoxynucleotide into the primer.

Once the primer extension reaction is employed, extended and unextended
identification primers (if any) can be separated and/or discriminated from each other
so as to identify the polymorphic site on the one or more alleles that are interrogated.
Separation of nucleic acids can be performed by any methods known in the art. Some
separation methods include the detection of DNA duplexes with intercalating dyes
such as, for example, ethidium bromide; hybridization methods to detect specific
sequences and/or separate or capture oligonucleotide molecules whose structures are
known or unknown; and hybridization methods in connection with blotting methods
well known in the art. Hybridization methods may be combined with other separation
technologies well known in the art, such as separation of tagged oligonucleotides
through solid phase capture, such as, for example, tag capture arrays, or capture of
hapten-linked oligonucleotides to immunoaffinity beads, which in turn may bear
magnetic properties. Solid phase capture technologies also include DNA affinity
chromatography, wherein an oligonucleotide is captured by an immobilized
oligonucleotide bearing a complementary sequence. Specific polynucleotide tags may
be engineered into oligonucleotide primers, and separated by hybridization with
immobilized complementary sequences. Such solid phase capture technologies also
include capture onto streptavidin-coated beads (magnetic or nonmagnetic) of
biotinylated oligonucleotides. DNA may also be separated with more traditional
methods such as centrifugation, electrophoretic methods or precipitation or surface
deposition methods. This is particularly applicable where the variant positions have
been interrogated using differentially labeled fluorescent terminating nucleotides, and
where the unextended oligonucleotides remain unlabeled and therefore essentially
invisible to electrophoresis platforms which detect fluorescent molecules separated on
the basis of their mass:charge ratio. This is particularly so when the extended or
unextended primers are in solution phase. The term solution includes particles
suspended in a liquid medium. Solutions can be aqueous, organic, or contain both
aqueous and organic components.

In some embodiments of SNP-IT™, the primer is bound to a solid support prior
to the extension reaction. In other embodiments, the extension reaction is performed
in solution (such as in a test tube or a micro well) and the extended product is
subsequently bound to a solid support. In an alternate embodiment of SNP-IT™, the
primer is detectably labeled and the extended terminator nucleotide is modified so as
to enable the extended primer product to be bound to a solid support. An example of
this includes where the primer is fluorescently labeled and the terminator nucleotide is
a biotin-labeled terminator nucleotide and the solid support is coated or derivatized
with avidin or streptavidin. In such embodiments, an extended primer would thus be
capable of binding to a solid support and non-extended primers would be unable to
bind to the support, thereby producing a detectable signal dependent upon a
successful extension reaction.

Preferably, the amplification reaction is multiplexed, where two or more or up
to 100 or more polymorphic sequences are amplified simultaneously in the same
reaction vessel. Preferably, the identification reaction is also multiplexed. Preferably,
primer extension is carried out in the same reaction as the amplification reaction(s),
and preferably sequentially. Completed amplification reactions can be treated with
such agents as Exonuclease I and Shrimp Alkaline Phosphatase, or other physical
treatments such as size exclusion filtration, to remove certain components of the
amplification reaction which may otherwise interfere with or prevent the primer
extension reaction from proceeding.

Polymerizing Agent 

Polymerizing agents are agents that are capable of extending a primer in a
template-dependent manner. Polymerizing agents may be isolated or cloned from a
variety of organisms including viruses, bacteria, archaebacteria, fungi, mycoplasma,
prokaryotes, and eukaryotes. Preferred polymerizing agents include polymerases.
More preferred are polymerases that tolerate and are active at temperatures greater
than physiological temperatures, for example, at 50°C to 70°C or are tolerant of
temperatures of at least 90°C to about 95°C. Preferred polymerases include Taq™
polymerase from T. aquaticus (commercially available from ABI, Foster City, CA),
Sequenase™ and ThermoSequenase™ (commercially available from U.S. Biochemical,
Cleveland, OH), and Exo(-) polymerase (commercially available from New England
Biolabs, Beverly, MA). Any polymerases exhibiting thermal stability may also be
employed, such as, for example, polymerases from Thermus species, including
Thermus aquaticus, Thermus brockianus, Thermus thermophilus, and Thermus flavus;
polymerases from Thermococcus litoralis and Thermotoga maritima; and polymerases
from the Pyrococcus species, including Pyrococcus furiosus, Pyrococcus sp. GB-D, and
Pyrococcus woesei. Biologically active proteolytic fragments, recombinant
polymerases, genetically engineered polymerizing enzymes, and modified
polymerases are included in the definition of polymerizing agent. It should be
understood that the invention can employ various types of polymerases from various
species and origins without undue experimentation.

Target Nucleic Acid 

The present invention comprises obtaining a target nucleic acid sequence
comprising a variant nucleotide and an invariant nucleotide. The target nucleic acid
sequence will preferably be biologically active with regard to its capacity to
hybridize with an oligonucleotide or a polynucleotide molecule. Target nucleic acid
sequences may be either DNA or RNA, single-stranded or double-stranded or a
DNA/RNA hybrid duplex. The target nucleic acid sequence may be a polynucleotide
or oligonucleotide. Preferred target nucleic acid sequences are between about 40 and
about 2000 nucleotides in length, in order to facilitate detection. Exceptionally long
segments of target nucleic acids, up to several tens of kb, may be required under some
circumstances, such as, for example, when analyzing polymorphisms in regions of
nucleic acids which have known pseudogenes, and long amplicons are required to
enable the selection of amplification primers specific for the gene, rather than the
pseudogene. Large target nucleic acid sequences may be cut or fragmented into
shorter segments by methods known in the art, e.g., by mechanical or hydrodynamic
shearing methods such as sonication, or by enzymatic methods such as restriction
enzymes or nucleases. These shorter segments may then be fractionated so that
shorter sequences bearing the variant nucleotide(s) of interest are separated from any
redundant sequences that might otherwise participate in undesirable side reactions
during analysis of the variant nucleotides. Methods of recovering such fractionated
DNA are well known in the art, and include, for example, gel electrophoresis, HPLC
and techniques that employ hybridization to a capture sequence.

The target nucleic acid may be isolated, or derived from a biological sample.
The term isolated as used herein refers to the state of being substantially free of other
material such as non-nuclear proteins, lipids, carbohydrates, or other materials such as
cellular debris or growth media with which the target nucleic acid may be associated
that can substantially interfere with the primer extension reactions described herein.
The term isolated is not intended to refer to a complete absence of these materials.
Neither is the term isolated generally intended to refer to the absence of stabilizing
agents such as water, buffers, or salts, unless they are present in amounts that
substantially interfere with the methods of the present invention. The term sample as
used herein refers to any material that contains, or is suspected to contain, nucleic
acid of interest, either DNA or RNA or DNA/RNA hybrids. Samples can be from any
source including plants and animals including humans. Generally, such material will
be in the form of a blood sample, a tissue sample, cells directly from individuals or
propagated in culture, plants, yeast, fungi, mycoplasma, viruses, archaebacteria,
histology sections, or buccal swabs, either fresh, fixed, frozen, or embedded in
paraffin or another fixative, or forensic samples, such as, for example, biological tissue
from a single individual or two or more individuals, alone or adhering to or mixed
with non-biological material. One example of a suitable sample is venous blood taken
into a collection device with an anticoagulant such as EDTA. Such a sample is
amenable to template preparation by, for example, alkali lysis. Other sample types
will be amenable to assay, but may require different or more extensive template
preparation such as, for example, by phenol/chloroform extraction, or capture of the
DNA onto a silica matrix in the presence of high salt concentration, or other methods.
Many methods are known to those of ordinary skill in the art for retrieving or
isolating nucleic acids from a wide variety of substances.

Preferably, the target nucleic acids are from or derived from genomic DNA
drawn from one or more individuals, as, for example, in conducting a forensic
investigation, a paternity test, an agricultural genotyping test or a pharmacogenetic
assay. Pharmacogenetic applications of the present invention may be employed, for
example, to predict or determine a phenotypic characteristic associated with the
identity of one or more variable nucleotides in a target nucleic acid of interest derived
from an individual's genome. Such a phenotypic characteristic may be, for example,
an individual's susceptibility to a particular disease state, an individual's prognosis
with regard to one or more pathologies, an individual's likely response to a
therapeutic regimen or agent, and the like. However, target nucleic acids need not
necessarily be genomic DNA. Indeed, other forms of DNA, such as, for example,
cDNA or cDNA libraries, can be employed in the invention. Indeed, virtually any
nucleic acid having or suspected as having a variant nucleotide, and capable of being
amplified in a primer extension reaction, should be suitable for use in the invention.
In a preferred embodiment, the target nucleic acids are derived from a forensic
sample.

The target nucleic acid may be, or may be derived from, either the upper or
lower strand nucleic acids of double stranded DNA, RNA or other nucleic acid
molecules. The upper strand of target nucleic acids includes the plus strand or sense
strand of nucleic acids. The lower strand of target nucleic acids is intended to mean
the minus or antisense strand that is complementary to the upper strand of target
nucleic acids. Thus, reference may be made to either strand and still comprise the
variant nucleic acid and a primer may be designed to hybridize to either or both
strands. This is because variant nucleotides can be identified by identifying or
detecting the variant nucleotide itself by, for example, employing amplification and
identification primers aimed at elucidating the identity of a variant nucleotide in, for
example, the coding strand of a gene, or employing amplification and identification
primers aimed at elucidating the identity of a variant nucleotide in, for example, the
corresponding non-coding strand of the gene. This is due to the complementarity of
Watson-Crick base pairing. Thus, one of ordinary skill in the art will appreciate that
the amplification and identification primers can be designed to reveal the identity of a
variant nucleotide or its complement, and the tag or tags of the amplification primer(s)
can be designed accordingly, as well as the identification primers.
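
Because of Watson-Crick complementarity, a base call made on the lower strand can be restated as an upper-strand allele. The short sketch below illustrates this conversion; the function name and the example calls are hypothetical.

```python
# Illustrative sketch only: names and example calls are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def upper_strand_alleles(lower_strand_calls):
    """Convert bases observed on the lower (antisense) strand into the
    corresponding upper (sense) strand alleles, sorted for readability."""
    return sorted(COMPLEMENT[base] for base in lower_strand_calls)


# A heterozygous G/A call on the lower strand corresponds to C/T on the
# upper strand at the same polymorphic site.
alleles = upper_strand_alleles(["G", "A"])
```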

Target nucleic acids are not meant to be limited to sequences within coding
regions, but may also include any region of a genome, or portion of a genome,
containing at least one variant nucleic acid. The term genome is meant to include
complex genomes, such as those found in animals, not excluding humans, and plants,
as well as much simpler and smaller sources of nucleic acids, such as nucleic acids of
viruses, viroids, and any other biological material comprising nucleic acids. One
example of a nucleic acid sequence suitable for analysis is an amplicon from within
the coding sequence of the ovine PrP gene, which encodes the prion protein. The
protein product of the PrP gene has known isoforms which can be assayed as
changes in the PrP gene sequence. An amplicon comprising one or more variant nucleic
acids is a suitable template for the invention described herein. Preferably, the target
nucleic acid comprises a single nucleotide polymorphism.

The target nucleic acid sequences or fragment(s) thereof contain the variant
nucleotide flanked by an invariant nucleotide, or include such nucleotides and
sequences located either distal or proximal to the nucleotides. The variant nucleotides
may be, or arise from, natural or induced mutations, deletions, insertions,
rearrangements, repetitive sequences, base modifications, or single or multiple base
changes in a nucleic acid sequence. Such changes and the more prevalent, or normal,
sequence may co-exist in a population. In some instances, these changes confer
neither an advantage nor a disadvantage to the species or individuals within the
species, and multiple alleles of the sequence may be in stable or quasi-stable
equilibrium. In some instances, however, these sequence changes will confer a
survival or evolutionary advantage to the species, and accordingly, an altered
sequence or allele may eventually over time be incorporated into the genome of many
or most members of that species. In other instances, the altered sequence or allele
confers a disadvantage to the species, as where the mutation causes or predisposes an
individual to a genetic disease or defect. As used herein, the terms mutation or
polymorphic site refer to one or more variant nucleotides in a given sequence
between some members of a species, a population within a species or between
species. Such mutations or polymorphisms include, but are not limited to, single
nucleotide polymorphisms (SNPs), one or more base deletions, or one or more base
insertions.

Amplicon 

An amplicon, as used herein, includes the product of a polymerase chain
reaction wherein primers are employed in the presence of a template and one or more
nucleotides and a template-dependent polymerizing agent to yield a nucleic acid. An
amplicon product of a primer extension reaction is typically double-stranded. Where
the amplicon is double stranded and the primers used to generate the amplicon are
identical (that is, all upper strand primers are identical to each other and all lower
strand primers are identical to each other), the sequences generated from the primers
that are introduced into the amplicon are identical in each amplicon molecule of the
resulting amplicon population, except for the situation where the identical primers
amplify a region of target DNA containing a variant nucleotide which is
heterozygous. Where the primers used to generate an amplicon are not identical (that
is, not all upper strand primers are identical to one another, and/or not all lower strand
primers are identical to one another), the amplicon is a population of molecules, or
population of amplicons, where the sequences generated from the primers in the
resulting amplicon are not identical, even in situations where the non-identical
primers amplify a region of target DNA which does not contain any variant
nucleotide. This situation arises when employing primers with different 5' tags. The
present invention employs this phenomenon to advantage in conducting primer
extension reactions. When the term amplicon is employed herein, it is meant to refer
to a population of individual amplicon molecules. Such a population may contain
amplicons that are identical, substantially identical, or that are not identical, as the
case may be. Non-identical amplicon populations are generated through employment
of non-identical primers, and/or the amplification of target DNA which contains a
variant nucleotide.

The amplicon may have primer sequences introduced into it by, for example,
employing a primer with a 5' tag. Such sequences can be introduced into an amplicon
by employing them, for example, in the 5' end of the primer, referred to herein as a 5'
tag. Such a 5' tag may comprise sequences that are natural or man-made. Either or
both strands of an amplicon may have such sequences, depending on whether either or
both upper and lower amplification primers bear such tags. Further, an amplicon may
exist as a population of amplicons generated as the result of employing primers that
differ in the characteristics of the 5' tag. Members of such a population of amplicons
will comprise sequences generated as the result of employing the 5' tags and of
sequences generated as the result of employing the target nucleic acid(s) as a
template. In an amplicon, a sequence generated by an amplification primer refers to
that portion of the amplicon that contains the primer sequence, including its 5' tag
sequence. In an amplicon, a sequence generated by the target nucleic acid of the
sample refers to that portion of the amplicon that contains the sequence of the target
nucleic acid that has extended beyond the 3' terminus of the primers in a
template-dependent manner, but will exclude the portion of the 3' extension which is
complementary to the opposing primer. Typically, the sequence generated by the
target nucleic acid can be located in an amplicon by noting the 3' terminus of the
primer sequence in one strand, noting the 3' terminus of the primer sequence in the
other strand, and observing that the intervening sequence corresponds to the sequence
generated by the target nucleic acid. Preferably, double stranded amplicons are
denatured prior to their use as templates in primer extension reactions.
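
The rule for locating the target-generated sequence in an amplicon can be sketched as follows. This is a minimal illustration under stated assumptions: the amplicon is given as its upper strand written 5' to 3', the forward primer (with its 5' tag) forms the start of that strand, and the reverse primer (with its 5' tag) appears at the end as its reverse complement. The function names and sequences are hypothetical.

```python
# Illustrative sketch only: names and sequences are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def target_generated_region(amplicon_upper, forward_primer, reverse_primer):
    """Return the portion of the upper strand lying between the 3' terminus
    of the forward primer and the 3' terminus of the reverse primer."""
    reverse_site = reverse_complement(reverse_primer)
    if len(forward_primer) + len(reverse_site) > len(amplicon_upper):
        raise ValueError("primers are longer than the amplicon")
    if not amplicon_upper.startswith(forward_primer):
        raise ValueError("forward primer not found at the 5' end")
    if not amplicon_upper.endswith(reverse_site):
        raise ValueError("reverse primer site not found at the 3' end")
    return amplicon_upper[len(forward_primer) : len(amplicon_upper) - len(reverse_site)]


forward = "TTTTGATTAC"   # invented: 5' tag TTTT + target-specific portion
reverse = "CCCCTGGATC"   # invented: 5' tag CCCC + target-specific portion
amplicon = forward + "AGGCT" + reverse_complement(reverse)
region = target_generated_region(amplicon, forward, reverse)
```

Everything outside the returned region, including both tags, is sequence generated by the amplification primers.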

Primers 

One primer, or two or more primers, may be employed having 5' tags, or
sequences, that are substantially incapable of hybridizing to the template, or target
nucleic acid, as long as the primer includes sequences that allow for sufficient
hybridization to the template, or target, so that desired sequences in the target nucleic
acids can be amplified. This can be achieved by employing sequences that are
substantially incapable of hybridizing to the template in, for example, the 5' end(s) of
the primer(s). Substantially incapable of hybridizing to a target nucleic acid means
unable to anneal or hybridize to the target nucleic acid and therefore cannot be
extended in the primer extension reaction. Preferably, a 5' tag should exhibit less
than 50% complementarity to target nucleic acid sequences in a sample or amplicon.
More preferably, a tag should exhibit less than 10% or 20% complementarity to
target nucleic acid sequences in a sample or amplicon. Most preferably, a 5' tag
should exhibit the least amount of complementarity consistent with its use, which can
be as low as 1% complementarity or less.
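
The text does not state how percent complementarity is computed. As one possible reading, the sketch below scores a 5' tag against a target strand by sliding the tag's perfect binding site along the target without gaps and reporting the best percentage of matching positions. This scoring rule, the function names, and the sequences are assumptions made for illustration.

```python
# Illustrative sketch only: the scoring rule is an assumption, not a
# definition taken from the text.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def max_complementarity(tag, target):
    """Best ungapped percent complementarity between `tag` and any
    tag-length window of `target` (both written 5'->3')."""
    site = reverse_complement(tag)  # the window that would pair perfectly
    n = len(site)
    if n == 0 or len(target) < n:
        raise ValueError("target must be at least as long as the tag")
    best = 0
    for start in range(len(target) - n + 1):
        window = target[start : start + n]
        matches = sum(1 for a, b in zip(site, window) if a == b)
        best = max(best, matches)
    return 100.0 * best / n


fully_paired = max_complementarity("AAAA", "TTTTGGGG")
unpaired = max_complementarity("AAAA", "AAAAAAAA")
```

A best-window score of this kind is strict, since long targets offer many windows. A practical design would also need to check the opposite strand and the other primers in the reaction.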

Primers can be polynucleotides or oligonucleotides capable of being extended
in a primer extension reaction at their 3' end. In order for an oligonucleotide to serve
as a primer, it typically need only be sufficiently complementary in sequence to be
capable of forming a double-stranded structure with the template, or target, under the
conditions employed. Establishing such conditions typically involves selection of
solvent and salt concentration, incubation temperatures, incubation times, assay
reagents and stabilization factors known to those in the art. The term primer or primer
oligonucleotide refers to an oligonucleotide as defined herein, which is capable of
acting as a point of initiation of synthesis when employed under conditions in which
synthesis of a primer extension product that is complementary to a nucleic acid strand
is induced, as, for example, in a DNA replication reaction such as a PCR reaction.
Like non-primer oligonucleotides, primer oligonucleotides may be labeled according
to any technique known in the art, such as with radioactive atoms, fluorescent labels,
enzymatic labels, proteins, haptens, antibodies, sequence tags, mass labels and the like.
Such labels may be employed by associating them, for example, with the 5' terminus
of a primer by a plurality of techniques known in the art. Such labels may also act as
capture moieties.

As used herein, the term polynucleotide includes nucleotide polymers of any
number. The term oligonucleotide includes a polynucleotide molecule comprising
any number of nucleotides, preferably, less than about 200 nucleotides. More
preferably, oligonucleotides are between 5 and 100 nucleotides in length. Most
preferably, oligonucleotides are 15 to 100 nucleotides in length. The exact length of a
particular oligonucleotide or polynucleotide, however, will depend on many factors,
which in turn depend on its ultimate function or use. Some factors affecting the
length of an oligonucleotide are, for example, the sequence of the oligonucleotide, the
assay conditions in terms of such variables as salt concentrations and temperatures
used during the assay, and whether or not the oligonucleotide is modified at the 5'
terminus to include additional bases for the purposes of modifying the mass:charge
ratio of the oligonucleotide, and/or providing a tag capture sequence which may be
used to spatially separate an oligonucleotide to a specific hybridization location
on a DNA chip, for example.

Short primers may require lower temperatures to form sufficiently stable 
hybrid complexes with a template. The primers of the present invention should be 
complementary to the upper or lower strand target nucleic acids. Preferably, primers 
should not have self complementarity involving their 3' ends in order to avoid primer 
fold back leading to self-priming architectures and assay noise. Preferred primers of 
the present invention include oligonucleotides from about 8 to about 100 nucleotides 
in length, to longer polynucleotides that may be up to several thousand nucleotides 
long. 
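
A simple screen for the 3' self-complementarity mentioned above can be sketched as follows. The check asks whether the last few bases at the 3' end could pair with an upstream stretch of the same primer, which would allow fold-back and self-priming. The function name, the default window of four bases, and the example primers are hypothetical, and loop-length constraints are ignored.

```python
# Illustrative sketch only: names, window size, and sequences are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def has_3prime_self_complementarity(primer, k=4):
    """True if the 3'-terminal k bases can pair with an upstream segment
    of the same primer (a potential fold-back structure)."""
    if len(primer) < k:
        raise ValueError("primer is shorter than the 3' window")
    tail = primer[-k:]
    partner = reverse_complement(tail)
    return partner in primer[:-k]


risky = has_3prime_self_complementarity("ACGTCAAAAAATTGACG")
clean = has_3prime_self_complementarity("ACGTACCATGGATCA")
```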

In practice, where sequences are introduced into an amplicon/amplicon
population, amplification primers must be sufficiently long so as to, under a given set
of conditions, (1) be able to hybridize with sufficient specificity to the target nucleic
acid to generate the amplicon, and (2) have a 5' tag long enough to introduce a
sequence into the resulting amplicon/amplicon population so that a primer extension
reaction can be employed with an identification primer that can selectively anneal to
the sequence which is at least partially the 5' tag region or is generated wholly by the
5' tag region. Any SNP introduced in the 5' tag can, just like any other SNP, be
analyzed on either strand of the amplicon. If the SNP is introduced by the 'forward'
primer, then it can be analyzed either by a primer designed to hybridize to the
extended forward primer, or by a primer designed to hybridize to the daughter strand
of the forward primer. In the first instance, the SNP could be introduced very close to
the portion of the initial amplification primer which is complementary to the target
DNA, and the primer interrogating this SNP could be substantially complementary to
the portion of the target specific sequence. This would minimize the size of the 5'
tag, which is desirable from both a cost and efficiency of synthesis standpoint. In the
second instance, the SNP could again be close to the junction of the non-hybridizable
and the hybridizable portions of the initial amplification primer. In this case however,
the interrogating primer would hybridize to the daughter strand, and would be at least
very similar in sequence to the sequence of the non-hybridizable portion of the
forward primer. In order to get good specificity, this second instance would require a
substantial 5' tag sequence. See Figure 18.

Primers of about 10 nucleotides are the shortest sequence that can be used to
selectively hybridize to a complementary target nucleic acid sequence against the
background of non-target nucleic acids in the present state of the art, although short
sequences such as this will have greater potential to hybridize perfectly with multiple
sites in a complex genome such as the greater than 3 billion base pair human genome.
Therefore the size and complexity of total target DNA must be considered in order to
design primers which will hybridize to just the target site intended. Most preferably,
sequences of unbroken complementarity over at least 20 to about 35 nucleotides are
used to assure a sufficient level of hybridization specificity, although length may vary
considerably given the sequence of the target DNA molecule. The primers of this
invention must be capable of specifically hybridizing, or annealing, to the target
nucleic acid sequence, such as, for example, one or more upper primers hybridizing
to one or more upper strand target nucleic acids or one or more lower strand nucleic
acids. As used herein, two nucleic acid sequences are said to be capable of
hybridizing to one another if the two molecules are capable of forming an
anti-parallel, double-stranded nucleic acid structure or hybrid under conditions
sufficient to promote such hybridization, whereas they must be substantially unable
to form a double-stranded structure or hybrid with one another when incubated with a
non-target nucleic acid sequence under the same conditions.
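
The effect of primer length on specificity in a large genome can be estimated with a rough calculation. The sketch below assumes a uniform random-sequence model, which real genomes do not follow, and counts perfect-match sites on both strands. The function name and the 3 billion base pair figure used in the example are illustrative.

```python
# Illustrative sketch only: a uniform random-sequence model is assumed.
def expected_perfect_sites(primer_length, genome_size, strands=2):
    """Expected number of perfect-match sites for a primer of the given
    length in a random genome of `genome_size` base pairs."""
    if primer_length < 1 or genome_size < primer_length:
        raise ValueError("primer must be between 1 base and the genome size")
    positions = genome_size - primer_length + 1
    return strands * positions / (4 ** primer_length)


genome = 3_000_000_000
sites_10mer = expected_perfect_sites(10, genome)   # thousands of sites
sites_20mer = expected_perfect_sites(20, genome)   # well below one site
```

Under this model a 10-nucleotide primer is expected to find thousands of perfect sites in a genome of this size, while a 20-nucleotide primer is expected to find far fewer than one, which is consistent with the preference for 20 to about 35 nucleotides of unbroken complementarity.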

Detection 

In yet another embodiment, the first and the second identification primers bear
a detectable characteristic. The detectable characteristic may be the same or different
on the first and second identification primers. The detectable characteristic may be a
characteristic selected from the group consisting of mass, apparent mass, molecular
weight, apparent molecular weight, a combination or ratio of mass and charge,
number of bases, magnetic resonance, spectrophotometry, fluorometry, electric
charge, polarimetry, light scattering, luminescence and antigen-antibody interaction.
The identification primers can be modified by methods known by those of ordinary
skill in the art to bear such characteristics. Preferably, the detectable characteristic
comprises a capture tag. Primers tagged with capture tags can be applied to, for
example, an array, an addressable array, or a virtual array, and the identity of the
primer can be determined by its binding to such an array. Capture tags are meant to
include nucleotide sequence tags, wherein capturing elements comprise the
complements of the nucleotide sequence tags. Most preferably, the detectable
characteristic is a change in mass:charge ratio induced by differential numbers of
nucleotides in 5' tags, such that the primers are separable by capillary gel
electrophoresis.
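
One way to picture the use of differential 5' tag lengths is to assign each identification primer a tag long enough that no two primers share a total length. The sketch below is a hypothetical planning helper, not a disclosed procedure; the minimum spacing of four nucleotides and the primer names are assumptions.

```python
# Illustrative sketch only: names and the spacing value are assumptions.
def assign_tag_lengths(core_lengths, spacing=4):
    """Given template-specific primer lengths, choose 5' tag lengths so
    that total primer lengths differ by at least `spacing` nucleotides.

    Returns a dict mapping primer name -> (tag_length, total_length)."""
    plan = {}
    next_total = 0
    # Process shortest cores first so tags stay as short as possible.
    for name, core in sorted(core_lengths.items(), key=lambda item: item[1]):
        total = max(core, next_total)
        plan[name] = (total - core, total)
        next_total = total + spacing
    return plan


plan = assign_tag_lengths({"site1": 20, "site2": 20, "site3": 22})
```

Since each primer gains exactly one terminator nucleotide on extension, the spacing between the extended products is the same as the spacing between the primers.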

10 

The primers of the present invention may be labeled at the 5' end. In a 
preferred embodiment, the identification primers are labeled at the 5' end. Labels 
include any label such as radioactive labels, fluorescent labels, enzymatic labels, 
proteins, haptens, antibodies, sequence tags, and the like. Preferably, the label does 

1 5 not interfere with the processes of the present invention. A preferred label includes a 
distinct nucleotide sequence that is complementary to a sequence bound to a solid 
support, where such solid support may include an array, including an addressable 
array or a virtual array. Thus, when the primer is exposed to the solid support under 
suitable hybridization conditions, the label hybridizes with the complementary
sequence bound to the solid support. In this way, the identity of the primer can be
determined by geometric location on the array, or by other means of identifying the 
point of association of the label with the capture moiety. 
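
By way of illustration only, the identification of a primer by its position on an addressable array can be sketched as a lookup. The array positions, capture sequences, primer names, and tag sequences below are hypothetical and are not taken from this disclosure; the sketch assumes perfect reverse-complement matching between a 5' sequence tag and its capture sequence.

```python
# Hypothetical layout: array position (row, column) -> immobilized capture sequence.
CAPTURE_PROBES = {
    (0, 0): "TCACGGTT",
    (0, 1): "GGATCCTA",
}

# Hypothetical 5' sequence tags carried by two identification primers.
PRIMER_TAGS = {
    "primer_target_site": "AACCGTGA",
    "primer_tag_site": "TAGGATCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_at(position):
    """Return the name of the primer whose tag hybridizes at this position, or None."""
    probe = CAPTURE_PROBES[position]
    for name, tag in PRIMER_TAGS.items():
        if reverse_complement(tag) == probe:
            return name
    return None
```

A signal observed at position (0, 1) would thus be attributed to the primer carrying the tag complementary to the capture sequence at that position.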

Most preferably, primer extension products of the identification primers are
separated and identified by capillary gel electrophoresis, wherein a fluorescence
detector is employed to identify primer extension products labeled with fluorescent
terminating nucleotides. In this most preferred embodiment, extended primers
bearing fluorescent labels are separated by their mass:charge ratio. However, many
separation and detection methods are known to those skilled in the art, and the
invention herein is amenable to a wide variety of detection and separation protocols
once this disclosure is in the hands of one skilled in the art. A primary advantage of
the invention is the variety of detectable characteristics and tags that may be placed on
the identification primers to aid in their separation and/or detection. Indeed, in the
absence of tags, the primers of the invention may be separated, detected, and/or
identified by their inherent physical characteristics or behavior, as is known to those
skilled in the art.

The term detection refers to identification of a detectable moiety or moieties.
The term is intended to include the ability to identify a moiety by electromagnetic
characteristics, such as, for example, charge, light, fluorescence, chemiluminescence,
changes in electromagnetic characteristics such as, for example, fluorescence
polarization, light polarization, dichroism, light scattering, changes in refractive
index, reflection, infrared, ultraviolet, and visible spectra, mass, mass:charge ratio and
all manner of detection technologies dependent upon electromagnetic radiation or
changes in electromagnetic radiation. The term is also intended to include
identification of a moiety based on binding affinity, intrinsic mass, mass deposition,
electrostatic properties, size and sequence length. It should be noted that
characteristics such as mass and molecular weight may be estimated by apparent mass
or apparent molecular weight, so the terms mass or molecular weight as used herein
do not exclude estimations as determined by a variety of instrumentation and
methods, and thus do not restrict these terms to any single absolute value without
reference to the method or instrumentation used to arrive at the mass or molecular
weight.

Another method of detecting the nucleotide present at the polymorphic site is
by comparison of the concentrations of free, unincorporated nucleotides remaining in
the reaction mixture at any point after the primer extension reaction. Mass
spectrometry in general and, for example, electrospray mass spectrometry, may be
employed for the detection of unincorporated nucleotides in this embodiment. This
detection method is possible because only the nucleotide(s) complementary to the
polymorphic base is (are) depleted in the reaction mixture during the primer extension
reaction. Thus, mass spectrometry may be employed to compare the relative
intensities of the mass peaks for the nucleotides. Likewise, the concentrations of
unlabeled primers may be determined and the information employed to arrive at the
identity of the nucleotide present at the polymorphic site.
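
The depletion-based reasoning above may be sketched as follows. This is a minimal illustration only: the peak intensities are assumed to be supplied per nucleotide before and after the reaction, and the 20% depletion threshold (min_fraction) is a hypothetical parameter chosen for the example, not a value given in this disclosure.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def depleted_nucleotides(before, after, min_fraction=0.2):
    """Return the bases whose free concentration fell by at least min_fraction.

    before/after map each base to a relative peak intensity (hypothetical
    units) measured before and after the primer extension reaction.
    """
    depleted = []
    for base in sorted(before):
        drop = (before[base] - after[base]) / before[base]
        if drop >= min_fraction:
            depleted.append(base)
    return depleted


def template_bases(before, after, min_fraction=0.2):
    """Infer the polymorphic base(s) as the complements of the depleted nucleotides."""
    return sorted(COMPLEMENT[b] for b in depleted_nucleotides(before, after, min_fraction))
```

Depletion of a single nucleotide would indicate one base at the polymorphic site, whereas depletion of two nucleotides would be consistent with a heterozygous or mixed sample.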
Support/Array

Preferred arrays for the present invention include, but are not limited to, 
addressable arrays including an array as defined above wherein individual positions 
have known coordinates such that a signal at a given position on an array may be 
identified as having a particular identifiable characteristic. Such arrays are commonly
referred to as chips, biochips, biochip arrays, DNA chips, RNA chips, nucleotide 
chips, and oligonucleotide chips. Array, as used herein, is intended to include arrays 
in any shape or configuration, 2-dimensional arrays, and 3-dimensional arrays. 

One particularly preferred array is the GenFlex™ Tag Array, from
Affymetrix, Inc., that is comprised of capture probes for 2000 tag sequences. These
are 20mers selected from all possible 20mers to have similar hybridization 
characteristics and at least minimal homology to sequences in the public databases. 

Preferred separation methods employ exposing any extended and unextended
primers to a solid support. Solid supports include arrays. The term array is used
herein to refer to an ordered arrangement of immobilized biological molecules at a
plurality of positions on a solid, semi-solid, gel or polymer phase. This definition
includes phases treated or coated with silica, silane, silicon, silicates and derivatives
thereof; plastics and derivatives thereof such as, for example, polystyrene, nylon and,
in particular, polystyrene plates; and glasses and derivatives thereof, including
derivatized glass, glass beads, and controlled pore glass (CPG). Immobilized biological
molecules include oligonucleotides that may include other moieties, such as tags
and/or affinity moieties. The term array is intended to include and be synonymous
with the terms chip, biochip, biochip array, DNA chip, RNA chip, nucleotide chip, and
oligonucleotide chip. All these terms are intended to include arrays of arrays, and are
intended to include arrays of biological polymers such as, for example,
oligonucleotides and DNA molecules whose sequences are known or whose
sequences are not known.

Transversion 

By the term transversion is meant a variant nucleotide in a nucleotide sequence,
wherein the variance is the occurrence of a purine in the place of a pyrimidine, or a
pyrimidine in the place of a purine. It will be appreciated by one of skill in the art that
normal transitions can also be used in these assays, although they will not regenerate
the SNP in opposite ratio on the other strand, and so are not preferred for forensic
applications. However, diagnostic, single template source applications only require
that we know what a 1:1 heterozygote looks like upon performing the primer
extension reaction.
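
The distinction between a transversion and a transition, as defined above, can be expressed as a short check. The function name and error handling are illustrative choices and form no part of the disclosure.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(first, second):
    """Classify a two-allele DNA variant as a 'transition' or a 'transversion'."""
    first, second = first.upper(), second.upper()
    alleles = {first, second}
    if first == second or not alleles <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition;
    # purine<->pyrimidine is a transversion.
    same_class = alleles <= PURINES or alleles <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```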

Nucleotide 

The primer extension reaction of the present invention employs a mixture of
one or more nucleotides, labeled or not, and a polymerizing agent. The term
nucleotide or nucleic acid as used herein is intended to refer to ribonucleotides,
deoxyribonucleotides, acyclic derivatives of nucleotides, and functional equivalents or
derivatives thereof, of any phosphorylation state capable of being added to a primer
by a polymerizing agent. Functional equivalents of nucleotides are those that act as
substrates for a polymerase as, for example, in an amplification method or a primer
extension method. Functional equivalents of nucleotides are also those that may be
formed into a polynucleotide that retains the ability to hybridize in a
sequence-specific manner to a target polynucleotide. Examples of nucleotides include
chain-terminating nucleotides, most preferably dideoxynucleoside triphosphates (ddNTPs),
such as ddATP, ddCTP, ddGTP, and ddTTP; however other terminators known to
those skilled in the art, such as, for example, acyclo nucleotide analogs, other acyclo
analogs, and arabinoside triphosphates, are also within the scope of the present
invention. Preferred ddNTPs differ from conventional 2'-deoxynucleoside
triphosphates (dNTPs) in that they lack a hydroxyl group at the 3' position of the sugar
component.

The nucleotides employed may bear a detectable characteristic. As used 
herein a detectable characteristic includes any identifiable characteristic that enables 
distinction between nucleotides. It is important that the detectable characteristic does 
not interfere with any of the methods of the present invention. Detectable 
characteristic refers to an atom or molecule or portion of a molecule that is capable of
being detected employing an appropriate method of detection. Detectable 
characteristics include inherent mass, electric charge, electron spin, mass tag, 
radioactive isotope, dye, bioluminescence, chemiluminescence, nucleic acid 
characteristics, haptens, proteins, light scattering/phase shifting characteristics, or
fluorescent characteristics. As used herein, the phrase "same detectable
characteristic" includes nucleotides that are detectable because they have the same
signal. The same detectable characteristic includes embodiments where nucleotides
are labeled with the same type of labels; for example, A and C nucleotides may be
labeled with the same type of dye, where they emit the same type of signal.

Nucleotides and primers may be labeled according to any technique known in 
the art. Preferred labels include radiolabels, fluorescent labels, enzymatic labels, 
proteins, haptens, antibodies, sequence tags, mass tags, fluorescent tags and the like. 
Preferred dye type labels include, but are not limited to, TAMRA
(carboxytetramethylrhodamine), ROX (carboxy-X-rhodamine), FAM (5-carboxyfluorescein),
and the like.

The primer extension reactions of the present invention can employ one or 
more labeled nucleotide bases. Preferably, two or more nucleotides of different bases 
are employed in the identification step. Most preferably, the identification reaction of 
the present invention employs four nucleotides of different bases. In the most 
preferred embodiment all four different types of nucleotide are labeled with 
distinguishable labels. For example, A may be labeled with dR6G, C with dTAMRA,
G with dR110 and T with dROX.

Nucleotides may also be detected by, or labeled with moieties that can be 
detected by, a variety of spectroscopic methods relating to the behavior of 
electromagnetic radiation. These spectroscopic methods include, for example, 
electron spin resonance, optical activity or rotation spectroscopy such as circular 
dichroism spectroscopy, fluorescence, fluorescence polarization, absorption/emission 
spectroscopy, ultraviolet, infrared, visible or mass spectroscopy, Raman spectroscopy 
and nuclear magnetic resonance spectroscopy. 

Locus 

The term "locus" includes a discrete region of a nucleic acid, such as DNA, 
that can be as few as a single base or as large as several hundred thousand bases in 
length. "Scrapie locus" includes the PrP gene, including a 310 base pair section of the
coding region of the PrP gene that harbors four single nucleotide polymorphisms that
have been associated with susceptibility to scrapie. 

Complex Genotype 

The phrase "complex genotype" includes genotypes where a sample contains
three or more representations of a locus, such as, for example, through some
duplication event increasing the number of copies of the said locus, or nucleic acids
from two or more individuals combined into a single nucleic acid sample, yielding a
sample that appears to contain three or more representations of any given locus.
Complex genotypes can result from samples from individuals exhibiting polyploidy;
the occurrence of additional chromosomes in a sample, such as, for example,
individuals exhibiting trisomy; gene duplications in a sample; when analyzing
samples from transgenic animals; and when the sample is derived from an individual
that has experienced placental anastomosis, and other phenomena. The phrase
"placental anastomosis" includes conditions where blood, cells or tissue of the fetuses
mix, to either a small or large degree. Individuals with mixtures of more than one cell
type, cell populations that exhibit an increase of a particular genetic element, and
mixed samples of nucleic acids from two or more individuals may all exhibit an
apparently complex genotype. Such complex genotypes are commonly observed in
forensic investigations in which DNA has been isolated from more than one
individual or species, or in circumstances where there is an over-representation of a
particular genetic element within a single source cell type, such as trisomy 21.
Organisms that are a mixture of more than one cell type, such as a chimera, may also
yield a complex genotype when DNA is recovered from the mixed cell population and
analyzed as if the DNA had originated from a single cell population.

In a preferred embodiment of the present invention, the target nucleic acid 
sequences are arranged in a format that allows multiple simultaneous detections 
(multiplexing), as well as parallel processing using oligonucleotide arrays. 

Preferred applications of the specialized primers and methods taught herein 
include diagnostic polymorphism genotyping from a single source template, that is, 
from non-mixed sources. The primers and methods taught herein are applicable, for 
example, to any single nucleotide polymorphism in any sequence context. In such an
embodiment, most preferably a control 1:1 heterozygotic site is introduced into the
amplicon employing the primers described herein. In this embodiment, assay of the 
control heterozygotic site confirms amplicon generation and serves as a benchmark 
for analysis of the polymorphism of the target nucleic acid being investigated. 

Preferred applications of the specialized primers and methods taught herein
also include forensic single nucleotide polymorphism genotyping from mixed
template sources. In such an embodiment, single nucleotide polymorphisms that are
transversions are most preferred, wherein the nucleotide directly upstream of the
transversion is complementary to the nucleotide directly downstream of the
transversion. In such an embodiment, at least two inventive primers are employed.
The first primer of this embodiment has a ratio of variant nucleotide X/Y of X:Y =
3:1, which reverses on the daughter strand to X:Y = 1:3. The second primer of this
embodiment has a balanced ratio of variant nucleotide X/Y of X:Y = 1:1, generating a
heterozygous site.
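
The reversal of the X:Y ratio on the daughter strand can be illustrated with a short sketch. It assumes that X and Y are complementary to one another (a G/C or an A/T transversion), so that every X on one strand is read as Y on the other; the dictionary representation of a ratio is a hypothetical convenience.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def daughter_strand_ratio(ratio):
    """Map base -> relative amount at the variant site onto the complementary strand."""
    return {COMPLEMENT[base]: amount for base, amount in ratio.items()}


# A 3:1 G:C tag population is read as 1:3 G:C on the daughter strand.
upper = {"G": 3, "C": 1}
lower = daughter_strand_ratio(upper)

# A transition (A/G) does not regenerate the same SNP: it appears as T/C instead.
transition_lower = daughter_strand_ratio({"A": 3, "G": 1})
```

A balanced 1:1 population is unchanged by this mapping, which is consistent with its use as a heterozygous control.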

Transversions particularly suited for forensic usage are those that are flanked
by short DNA sequences that are palindromic in nature. That is to say, the
bases immediately 5' to the transversion on each strand are the same, the bases
immediately 3' to the transversion site are the same on each strand, and, on a
given strand, the base immediately 5' adjacent to the transversion is the complement
of the base immediately 3' adjacent to the transversion. This complementarity may
extend to the bases at -2 and +2, and so forth to a limited number of bases. These
transversions may be termed SWaP SNPs, meaning that they are either G/C (=S) or
A/T (=W) transversions which are located amid palindromes (SWaP = S,W amid
Palindromes), and they have the characteristic that any asymmetry in the
representation of G/C or A/T is reversed to be the inverse asymmetry on the other
strand, without affecting the type of SNP or the flanking DNA sequence around the
SNP.
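
A candidate site may be screened for the SWaP property along the following lines. This is a sketch under stated assumptions: the flanking sequences are given 5' to 3' on a single strand, the function name and the depth parameter (the number of flanking positions checked on each side) are hypothetical, and no particular screening procedure is prescribed by this disclosure.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def is_swap_snp(upstream, alleles, downstream, depth=1):
    """Check for a G/C or A/T transversion flanked by complementary bases.

    upstream and downstream are the flanking sequences on one strand (5'->3');
    alleles is a pair such as ("G", "C"). The base at -k must be the
    complement of the base at +k for k = 1..depth.
    """
    first, second = alleles
    if COMPLEMENT.get(first) != second:
        return False  # not an S (G/C) or W (A/T) transversion
    if len(upstream) < depth or len(downstream) < depth:
        return False
    for k in range(1, depth + 1):
        if COMPLEMENT.get(upstream[-k]) != downstream[k - 1]:
            return False
    return True
```

For the context AAT[G/C]ATC, the check passes at depths 1 and 2 and fails at depth 3, where A at position -3 is not the complement of C at position +3.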

One embodiment of the invention comprises a method of performing a primer
extension reaction, comprising: obtaining an amplicon having a sequence generated
from a target nucleic acid and a sequence generated from a first strand amplification
primer, by amplifying a target nucleic acid having a variant nucleotide flanked by an
invariant nucleotide, wherein a first strand amplification primer is employed that
comprises a 5' tag substantially incapable of hybridizing to the target nucleic acid
under amplification conditions, and wherein the 5' tag contains the variant nucleotide
of the target nucleic acid, and employing a second strand amplification primer;
employing the amplicon in a primer extension reaction wherein the identity of the
variant nucleotide in the sequence generated from the target nucleic acid is
determined by hybridizing a first identification primer immediately adjacent to the
variant nucleotide in the sequence generated from the target nucleic acid; hybridizing
a second identification primer immediately adjacent to the variant nucleotide in the
sequence generated from the amplification primers; extending the first and the second
identification primers in the presence of one or more nucleotides and a polymerizing
agent; determining the identity of the variant nucleotide generated from the target
nucleic acid; and comparing extension product of the first identification primer and
extension product of the second identification primer, thereby performing the primer
extension reaction.

In another embodiment of the invention, immediately adjacent in the 5'
direction to the variant nucleotide in the 5' tag is the invariant nucleotide to the 5'
direction of the variant nucleotide of the target nucleic acid. By immediately adjacent
in the 5' direction is meant the next nucleotide in the 5' direction from the variant
nucleotide. Thus, in this embodiment, the 5' tag comprises the variant nucleotide of
the target nucleic acid and the next nucleotide in the 5' direction of the variant
nucleotide, arranged as in the target nucleic acid.

In yet another embodiment of the invention, immediately adjacent in the 3'
direction to the variant nucleotide in the 5' tag is the invariant nucleotide to the 3'
direction of the variant nucleotide of the target nucleic acid. By immediately adjacent
in the 3' direction is meant the next nucleotide in the 3' direction from the variant
nucleotide in the target nucleic acid. Thus, the 5' tag can comprise the variant
nucleotide of the target nucleic acid and the next nucleotide in the 3' direction of the
variant nucleotide, arranged as in the target nucleic acid. In another embodiment of
the invention, immediately adjacent in the 3' direction to the variant nucleotide in the
5' tag is the invariant nucleotide to the 3' direction of the variant nucleotide of the
target nucleic acid, and immediately adjacent in the 5' direction to the variant
nucleotide in the 5' tag is the invariant nucleotide to the 5' direction of the variant
nucleotide of the target nucleic acid. Thus, the 5' tag can comprise the variant
nucleotide of the target nucleic acid and both the next nucleotide in the 3' direction of
the variant nucleotide and the next nucleotide in the 5' direction of the variant
nucleotide, arranged as in the target nucleic acid. Further, the 5' tag can comprise at
least two invariant nucleotides immediately adjacent in the 3' direction to the variant
nucleotide, and at least two invariant nucleotides immediately adjacent in the 5'
direction to the variant nucleotide, wherein the at least two invariant nucleotides
immediately adjacent in the 3' direction and the at least two invariant nucleotides
immediately adjacent in the 5' direction are selected so as to be substantially
homologous to the corresponding nucleotides flanking the variant nucleotide in the
target nucleic acid. Preferably, the corresponding nucleotides flanking the variant
nucleotide in the target nucleic acid should be similar to the flanking nucleotides in
the 5' tag so as to present the same or similar sequence context as is present around
the variant nucleotide in the target nucleic acid, with respect to the effect these
flanking nucleotides would have on the incorporation of the variant nucleotide in a
primer extension reaction. Preferably, substantially homologous is 80% or more
homology. More preferably, substantially homologous is 90% or more homology.
Most preferably, substantially homologous is 99% or more homology.
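
The homology thresholds recited above can be illustrated by a simple position-by-position comparison of the flanking nucleotides in the 5' tag with those in the target. The sketch assumes that "homology" is computed as percent identity over ungapped flanks of equal length, which is one possible reading and not a definition given in this disclosure; the function names and tier labels are hypothetical.

```python
def percent_identity(tag_flank, target_flank):
    """Percent of positions at which two equal-length flanking sequences match."""
    if not tag_flank or len(tag_flank) != len(target_flank):
        raise ValueError("flanks must be non-empty and of equal length")
    pairs = zip(tag_flank.upper(), target_flank.upper())
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(tag_flank)


def homology_tier(pct):
    """Place a percent identity into the 80% / 90% / 99% tiers recited above."""
    if pct >= 99:
        return "most preferred"
    if pct >= 90:
        return "more preferred"
    if pct >= 80:
        return "preferred"
    return "not substantially homologous"
```

With flanks of only a few nucleotides, a single mismatch already falls below the 80% tier unless at least five positions are compared.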

In another embodiment of the invention, the identification primers are extended
by one or more labeled nucleotide bases, and are capable of being detected by a
characteristic selected from the group consisting of mass, apparent mass, molecular
weight, apparent molecular weight, a combination or ratio of mass and charge,
number of bases, magnetic resonance, spectrophotometry, fluorometry, electric
charge, polarimetry, light scattering, luminescence and antigen-antibody interaction.

In another embodiment of the invention, the identification primers are extended
by a chain terminator. Chain terminators may be dideoxynucleotides, acyclo
terminators, and the like. The chain terminators may be labeled such that the resulting
extended primers are detectable by characteristics such as mass, apparent mass,
molecular weight, apparent molecular weight, a combination or ratio of mass and
charge, number of bases, magnetic resonance, spectrophotometry, fluorometry,
electric charge, polarimetry, light scattering, luminescence and antigen-antibody
interaction. Preferably, the chain terminators are labeled with fluorescent or
fluorogenic moieties, allowing their detection with, for example, conventional
fluorescence detection instrumentation coupled to capillary electrophoresis
apparatuses.

In another embodiment, identification primers are applied to virtual arrays 
where extended and unextended primers are separated on an array where the array 
comprises a suspension of microspheres, where the microspheres bear one or more 
capture moieties to separate the tagged primers. The microspheres, in turn, bear 
unique identifying characteristics such that they are capable of being separated on the
basis of that characteristic, such as, for example, diameter, density, size, color, and the
like. 

Another preferred array is the addressable array that has sequence tags that
complement 5' tags of nucleic acids, such as primers, to be analyzed. These
complementary tags are bound to the array at known positions. This type of tag
hybridizes with the array under suitable hybridization conditions. By locating the 
bound primer in conjunction with detecting one or more extended primers, the 
nucleotide identity at the polymorphic site can be determined. 

In another embodiment, the invention comprises varying the identity of the
variant nucleotide in the 5' tag so as to generate a population of amplicons in which
the identity of the variant nucleotide derived from the 5' tags is fixed at a known ratio.
By varying the identity of the variant nucleotide in the 5' tag is meant employing
primers where the identity of the nucleotide at the position of the variant nucleotide is
not the same in all primer molecules. Thus, all primers may bear the invariant
nucleotide flanked by either the variant nucleotide or another nucleotide that is not the
invariant nucleotide. For example, where the variant nucleotide is a G/C SNP, then a
primer population may be used that bears a G at the variant nucleotide site in one-half
of the primer molecules, whereas the remaining half of the primer molecules bear a C
at the variant nucleotide site. Amplification employing a target nucleic acid having
such a G/C SNP will generate a population of amplicons wherein all will have the
naturally occurring G/C SNP, but one-half will bear a G at the variant nucleotide site
generated by the 5' tag sequence and the remaining half will bear a C at the variant
nucleotide site generated by the 5' tag sequence. This population of amplicons can
then be probed with identification primers that are specific for the variant site
generated from the 5' tags and that are specific for the naturally occurring variant site.
Thus, the variant nucleotide can be reproduced in the amplicon at a known ratio. In a
preferred embodiment of the invention, the identity of the variant nucleotide in the 5'
tag is varied so as to generate a population of amplicons that is a balanced
heterozygous population with respect to the variant nucleotide. A balanced
heterozygote is a mixture of DNA species in which there are equivalent
concentrations of two distinct DNA sequences. In terms of the present invention, the
use of equivalent concentrations of the 5' tag primers bearing equivalent
concentrations of the variant nucleotides will result in an amplicon population in
which there are two species with respect to the sequences derived from the 5' tag
portions of the amplification primers. Such equivalence of distinct DNA sequences
may be said to represent a balanced heterozygote.
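
The relationship between the composition of the tagged primer population and the expected ratio at the tag-derived variant site can be sketched as follows. The sketch assumes that each primer species is incorporated in proportion to its molar amount (equal amplification efficiency), which is a simplifying assumption; primer amounts are given as integers in arbitrary hypothetical units.

```python
from fractions import Fraction


def tag_site_ratio(primer_pool):
    """Expected fraction of amplicons bearing each base at the tag-derived site.

    primer_pool maps a base to the relative (integer) molar amount of tagged
    primer bearing that base at the variant position.
    """
    total = sum(primer_pool.values())
    return {base: Fraction(amount, total) for base, amount in primer_pool.items()}


def is_balanced_heterozygote(primer_pool):
    """True when exactly two primer species are present in equal amounts."""
    fractions = tag_site_ratio(primer_pool)
    return len(fractions) == 2 and len(set(fractions.values())) == 1
```

Under this assumption, a pool of equal parts G-bearing and C-bearing primer yields the balanced 1:1 site described above, independent of the genotype at the naturally occurring site.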

In a preferred embodiment of the invention, the target nucleic acid comprises 
nucleic acids from two or more individuals. By two or more individuals is meant two 
or more biological entities that comprise nucleic acids. For example, the target 
nucleic acid may be a forensic sample, comprising nucleic acids from the victim of a 
crime and nucleic acids from one or more other individuals. The term individual is
meant to include members of any species that harbors nucleic acids, and is not meant
to be limited only to humans. Indeed, the sample may comprise nucleic acids from
two or more different species or two or more individuals of different genera.

In another embodiment of the invention, two or more variant nucleotides are
identified. The variant nucleotides may be on the same nucleic acid molecule, or
target nucleic acid, or may be on separate nucleic acid molecules, or target nucleic
acids. Preferably, the two or more variant nucleotides are on the same target nucleic
acid molecule. Most preferably, the two or more variant nucleotides are situated
such that the invention can be practiced wherein they appear on the same amplicon
molecule.

In another embodiment, the invention comprises a method of performing a
primer extension reaction, comprising: obtaining a sample comprising target nucleic
acid from one or more individuals; obtaining an amplicon population having a
sequence generated from the sample and a sequence generated from a tagged first
strand amplification primer, by amplifying nucleic acids in the sample having a
variant nucleotide that is a transversion flanked in the 5' direction by an invariant
nucleotide and flanked in the 3' direction by an invariant nucleotide, wherein the
tagged first strand primer is employed that comprises a 5' tag substantially incapable
of hybridizing to target nucleic acids in the sample, and wherein the 5' tag contains
the variant nucleotide with its flanking invariant nucleotides, and wherein a second
strand amplification primer is employed; employing the amplicon population in a
primer extension reaction wherein the identity of the variant nucleotide in the
sequence generated from the sample is determined by hybridizing a first identification
primer immediately adjacent to the variant nucleotide in the sequence generated from
the sample; hybridizing a second identification primer immediately adjacent to the
variant nucleotide in the sequence generated from the amplification primer; extending
the first and the second identification primers in the presence of one or more
nucleotides and a polymerizing agent; determining the identity of the variant
nucleotide generated from the sample; and comparing extension product of the first
identification primer and extension product of the second identification primer,
thereby performing the primer extension reaction.

In another embodiment of the invention, the flanking invariant
nucleotide in the 5' direction of the transversion is complementary to the flanking
invariant nucleotide in the 3' direction of the transversion.



In another embodiment, the first strand amplification primer comprises the two
or more nucleotides in the 5' direction immediately adjacent to the variant nucleotide
of the first strand amplification primer, wherein the two or more nucleotides are
identical to the two or more nucleotides immediately adjacent in the 5' direction of
the variant nucleotide in the target. Thus, an identification primer employed to
determine the variant nucleotide in the sequence generated by the 5' tag will employ
the same two 3' terminal nucleotides as are present in the identification primer. In
another embodiment of the invention, the first strand amplification primer comprises
the two or more nucleotides in the 3' direction immediately adjacent to the variant
nucleotide of the first strand amplification primer, wherein the two or more
nucleotides are identical to the two or more nucleotides immediately adjacent in the 3'
direction of the variant nucleotide in the target nucleic acid. In another embodiment
of the invention, the first strand amplification primer comprises the two or more
nucleotides in the 5' direction immediately adjacent to the variant nucleotide of the
first strand amplification primer, and the two or more nucleotides in the 3' direction
immediately adjacent to the variant nucleotide of the first strand amplification primer,
each arranged as to be identical to the corresponding nucleotides flanking the variant
nucleotide in the target nucleic acid. Inclusion of these invariant nucleotides around
the variant nucleotide in the 5' tag will generate an amplicon, or population of
amplicons, having a sequence context similar to that in the vicinity of the variant
nucleotide present in the target DNA, thus reducing DNA sequence context-sensitive
effects that might otherwise confound results on primer extension with identification
primers.

In another embodiment of the invention, the second strand amplification primer
comprises a 5' tag having the variant nucleotide. The second strand amplification
primer can also have the variant nucleotide, which can also be flanked, in either the 5'
direction or the 3' direction or both, by the invariant nucleotide(s) flanking the variant
nucleotide in the target. In the event that the variant nucleotide is a transversion, the
first variant nucleotide in the 5' direction is complementary to the first variant
nucleotide in the 3' direction flanking the transversion, and it is most preferred that
both variant nucleotides be included in the 5' tag of the first strand amplification
primer and in the 5' tag of the second strand amplification primer.

In another embodiment of the invention, the identity of the variant nucleotide in 
25 the first and second strand amplification primers is varied so as to generate a 

population of amplicons wherein the identity of the variant nucleotide is varied at a 
known ratio. Preferably, the identity of the variant nucleotide in the 5' tag of the first 
and second strand amplification primers is varied so as to generate an amplicon 
population comprising a ratio of one to one (1:1) and a ratio of three to one (3:1) with 
30 respect to the identity of the nucleotides in the amplicon population generated by the 
5' tags. The resulting amplicon will, for example, have a ratio of 3: 1 and 1 : 1 in the 
upper strand sequence generated by the tags, and a ratio of 1 :3 and 1 : 1 in the lower 
strand sequence generated by the tags, in addition to the variant nucleotide generated 
from the target nucleic acid. Employment of identification primers in a primer 
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extension reaction will thus generate signals corresponding to the molar ratios stated 
above, and will be available for comparison to the identification primer result for the 
variant nucleotide generated from the target nucleic acid. 
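
By way of illustration only, the comparison of tag-derived signals at known ratios with the target-derived signal might be sketched as follows. The sketch assumes a simple multiplicative model, in which each observed signal ratio equals the true molar ratio multiplied by a constant incorporation bias. The function names, the model, and the example signal values are hypothetical and are not taken from this disclosure.

```python
import math


def estimate_bias(observed_by_known_ratio):
    """Estimate a constant incorporation bias from tag-derived controls.

    observed_by_known_ratio maps each known molar ratio (for example 1.0 for
    a 1:1 tag mix and 3.0 for a 3:1 tag mix) to the signal ratio observed for
    that control. The bias is the geometric mean of observed/known.
    """
    logs = [math.log(observed / known)
            for known, observed in observed_by_known_ratio.items()]
    return math.exp(sum(logs) / len(logs))


def correct_ratio(observed_ratio, bias):
    """Remove the estimated bias from a target-derived signal ratio."""
    return observed_ratio / bias


# Hypothetical readings: the 1:1 control reads 1.5 and the 3:1 control reads 4.5.
bias = estimate_bias({1.0: 1.5, 3.0: 4.5})
target_ratio = correct_ratio(3.0, bias)
```

Under these assumed readings, a target-derived signal ratio of 3.0 would correspond to a template ratio of approximately 2:1 once the bias is removed.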

In another embodiment, the first and the second identification primers bear a 
detectable characteristic. The detectable characteristic of the first identification 
primer may be the same or different from the detectable characteristic of the second 
identification primer. 

In another embodiment of the invention, two or more variant nucleotides are 
identified. One, both, or two or more of the variant nucleotides can be transversions. 
The two or more variant nucleotides can be on the same nucleic acid molecule, or 
they can be on different nucleic acid molecules. 

In another embodiment of the invention, the variant nucleotide is a 
transversion, and the identification primers are extended by one or more labeled 
nucleotide bases, and are capable of being detected by a characteristic selected from 
the group consisting of mass, apparent mass, molecular weight, apparent molecular 
weight, a combination or ratio of mass and charge, number of bases, magnetic 
resonance, spectrophotometry, fluorometry, electric charge, polarimetry, light 
scattering, luminescence and antigen-antibody interaction. 

In another embodiment of the invention, the variant nucleotide is a 
transversion, and the identification primers are extended by a chain terminator. The 
chain terminator may be a dideoxynucleotide or an acyclo terminator. The chain 
terminator can be labeled with a detectable moiety. Most preferably, the chain 
terminator is labeled such that it can be detected with a fluorescence detector. 

In another embodiment, the variant nucleotide comprises a transversion and the 
identification primers comprise a tag capture moiety. The identification primers with 
tag capture moieties may be captured on an array. The array may be an addressable 
array or a virtual array. 

In another embodiment of the invention, the variant nucleotide is a 
transversion and the second strand amplification primer comprises a 5' tag having the 
same variant nucleotide, the same invariant nucleotide flanked in the 5' direction, and 
the same invariant nucleotide flanked in the 3' direction as the first strand 
amplification primer, and wherein the first strand amplification primer reflects a 
transversion ratio of 1:1 in the variant nucleotide and wherein the second strand 
amplification primer reflects a transversion ratio of 1:3 in the variant nucleotide, and 
wherein at least three identification primers are employed in the primer extension 
reaction. 

In a preferred embodiment, the individuals are sheep. In another embodiment, 
preferably at least one of the one or more individuals displays at least one complex 
genotype. Preferably, the target nucleic acid comprises the PrP locus. 

In another embodiment, the invention comprises a method of performing 
primer extension utilizing at least two amplification primers comprising: obtaining a 
target nucleic acid comprising a variant nucleotide flanked by an invariant nucleotide; 
hybridizing to the target nucleic acid a first amplification primer having a 5' tag 
comprising the variant nucleotide flanked by the invariant nucleotide, wherein the 5' 
tag is substantially unable to hybridize to the target nucleic acid, and a second 
amplification primer; and extending the amplification primers in the presence of at 
least one or more nucleotides and a polymerizing agent, thereby performing primer 
extension. 

In another embodiment, the invention comprises a composition, comprising: a 
primer having a region capable of hybridizing to a target nucleic acid wherein the 
target nucleic acid comprises a variant nucleotide and an invariant nucleotide, and 
wherein the primer further comprises a 5' tag region having the variant nucleotide and 
the invariant nucleotide of the target nucleic acid, and wherein the 5' tag region is 
substantially incapable of hybridizing to the target nucleic acid under conditions 
suitable for amplification of the target nucleic acid. Conditions sufficient to achieve 
amplification are well known in the art and have been illustratively described or 
incorporated by reference herein. Such conditions include protocols for amplification 
of target nucleic acids by thermally stable polymerizing agents. 

In another embodiment, the invention comprises a method of monitoring the 
efficiency of incorporation of chain terminators into primers in a primer extension 
reaction, comprising: generating a population of amplicons from a mixed sample of 
target nucleic acid, wherein the population of amplicons comprises 5' tag variant 
sequences at known ratios and target-derived variant sequences at unknown ratios; 
performing primer extension reactions on the population of amplicons employing 
chain terminators and employing a population of distinguishable primers specific for 
the variant sequences present in the 5' tag derived sequences and the target-derived 
variant sequences; detecting and measuring efficiency of incorporation of chain 
terminators into the population of primers at the known ratios, thereby monitoring the 
efficiency of incorporation of chain terminators into primers in a primer extension 
reaction against the 5' tag derived variant sequences; detecting and measuring 
efficiency of incorporation of chain terminators into the population of primers at the 
unknown ratios, thereby measuring the rate of incorporation of chain terminators into 
primers in a primer extension reaction against the target-derived variant sequences. 
The phrase "mixed sample" includes samples comprising nucleic acids from two or 
more individuals. By a population of primers specific for the known sequences is 
meant a population of identification primers. 

In yet another embodiment, the invention comprises a method of performing a 
primer extension reaction, comprising: obtaining a sample comprising target nucleic 
acid from one or more individuals; obtaining an amplicon population having a 
sequence generated from the sample and a sequence generated from a tagged first 
strand amplification primer, by amplifying nucleic acids in the sample having a 
variant nucleotide, wherein the tagged first strand primer is employed that comprises 
a 5' tag substantially incapable of hybridizing to target nucleic acids in the sample, 
and wherein the 5' tag contains the variant nucleotide, and wherein a second strand 
amplification primer is employed; employing the amplicon population in a primer 
extension reaction wherein the identity of the variant nucleotide in the sequence 
generated from the sample is determined by hybridizing a first identification primer 
immediately adjacent to the variant nucleotide in the sequence generated from the 
sample; hybridizing a second identification primer immediately adjacent to the variant 
nucleotide in the sequence generated from the amplification primer; extending the 
first and the second identification primers in the presence of one or more nucleotides 
and a polymerizing agent; determining the identity of the variant nucleotide generated 
from the sample; and comparing extension product of the first identification primer 
and extension product of the second identification primer, thereby performing the 
primer extension reaction. The variant nucleotide may represent a transversion and 
may arise due to, for example, an insertion, deletion, rearrangement, or by any other 
way that variability is introduced into a nucleic acid sequence either naturally or 
synthetically. 

In yet another embodiment, the invention provides a method of screening 
animals for susceptibility to a disease or disorder, comprising: determining the 
identity of polymorphic nucleotides at three or more alleles at a locus; and employing 
the identities of the polymorphic nucleotides to determine whether the animal is 
susceptible to a disease or disorder. The animals are preferably sheep, and at least 
one of the animals preferably displays a complex genotype with respect to at least one 
locus. In a preferred embodiment, the identities of polymorphic nucleotides at three 
or more alleles at a locus are determined. In a preferred embodiment, the disease or 
disorder is a transmissible encephalopathy, such as, for example, scrapie. The 
screening can be used to determine whether an animal can be used in a controlled 
breeding program to increase or decrease the prevalence of a particular allotype that 
contributes to the complex genotype. 

Any of the above embodiments can be used in a method of breeding scrapie- 
resistant sheep, where the method comprises determining the identity of polymorphic 
nucleotides at two or more alleles at the PrP locus of a male sheep and a female sheep 
using the methods of the invention, employing the identities of the polymorphic 
nucleotides to determine whether the male sheep and the female sheep possess two or 
more alleles that are not associated with susceptibility to scrapie; and breeding male 
sheep and female sheep that possess two or more alleles that are not associated with 
susceptibility to scrapie. Preferably, animals that harbor alleles associated with 
susceptibility to scrapie will not be used for breeding, particularly where there are 
three or more alleles of the PrP locus. Where an animal has three or more alleles at 
the PrP locus, the outcome of breeding will be less certain than in animals that do not 
display a complex genotype. 

The methods and compositions of the present invention can be used to 
determine the presence or absence of complex genotypes, and to identify 
polymorphisms of such complex genotypes. Complex genotypes include genotypes 
that result from situations where a sample contains nucleic acids from an individual 
where the individual possesses three or more alleles of a nucleic acid of interest. 
Complex genotypes can arise from an individual exhibiting polyploidy, for example. 
However, complex genotypes can arise even in the absence of polyploidy. Other 
examples of individuals that can exhibit complex genotypes are individuals having 
one or more gene duplications, transgenic individuals, individuals having a trisomy or 
a chromosome duplication in whole or in part, and individuals affected by other 
phenomena, such as placental anastomosis, that result in apparent chimerism. 



One way in which complex genotypes can arise is through placental anastomosis. Placental 
anastomosis occurs when the placentas of twin fetuses fuse and their bloodstreams 
mix. As a result, stem cells and alleles can mix. The mixing of alleles due to 
placental anastomosis can confer varying dosages of alleles on the fetuses affected, 
and this may vary from tissue type to tissue type in the affected animals. Slight 
mixing of blood might add a relatively low dosage of a particular allele into the 
genotype of the affected individual, presenting as an apparent minor anomaly when 
the allele is investigated in an affected individual when using common genotyping 
methodology. The methods and compositions disclosed herein are uniquely suited to 
detecting the presence of complex genotypes, and genotyping, or identifying, the 
alleles associated with complex genotypes. 

An example of a complex genotype that can confound genotyping efforts 
when using conventional genotyping methods occurs in association with the PrP allele 
in the ovine disease scrapie. Scrapie is a fatal neuro-degenerative disease of sheep 
and goats, and is a member of the transmissible spongiform diseases that include 
bovine spongiform encephalopathy (BSE) and human Creutzfeldt-Jakob Disease 
(CJD). In common with other transmissible spongiform encephalopathies (TSEs), 
scrapie is characterized by misfolding of the protein product of the prion protein gene, 
PrP. In sheep, the level of susceptibility to infection has been linked to the 
combination of particular alloforms of the protein. In particular, the amino acids 
present at positions 136, 154 and 171 have a strong bearing on the level of 
susceptibility, with the ARR (136 alanine, 154 arginine, 171 arginine) being most 
refractory and VRQ (136 valine, 154 arginine, 171 glutamine) amongst the most 
susceptible. In total there are five alloforms of the protein known, combining to give 
15 possible genotypes. The five alloforms of the PrP gene at positions 136, 154 and 
171 are represented by the single letter amino acid code as ARQ, ARR, ARH, AHQ 
and VRQ. 
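
The count of 15 follows from pairing five alloforms without regard to order and with homozygotes allowed, that is, 5 x 6 / 2 = 15. As an illustrative check only, the enumeration can be sketched in Python; the function name is hypothetical, and the sketch assumes a normal complement of two copies per animal.

```python
from itertools import combinations_with_replacement

# The five alloforms named in the text (positions 136, 154 and 171).
ALLOFORMS = ["ARQ", "ARR", "ARH", "AHQ", "VRQ"]


def diploid_genotypes(alleles):
    """Return every unordered pair of alleles, homozygotes included."""
    return ["/".join(pair) for pair in combinations_with_replacement(alleles, 2)]


genotypes = diploid_genotypes(ALLOFORMS)
print(len(genotypes))  # 15
```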

Conventional genotyping technologies used to detect scrapie have failed to 
detect and discern complex genotypes at the PrP locus. For example, current 
genotyping technology for scrapie takes no account of the possibility that there is 
anything other than a normal gene complement of two copies present in an individual. 
However, an animal that has, for example, the four single nucleotide polymorphisms, 
or SNPs, required to designate the animal as ARR/ARH may actually be 
ARR/ARH/ARQ, with the third allele contributing no unique bands to indicate its 
presence when using certain conventional genotyping methods and compositions. 
Current assay technologies, which identify only the presence or absence of specific 
nucleotides at the four SNP sites, are insufficient to allow accurate genotyping of 
animals that may have more than two copies of the PrP gene locus. Only by 
quantifying the relative proportions of the single nucleotide polymorphisms present in 
a given profile can it be determined whether additional copies of the template DNA 
target are present. Thus, current technologies do not enable an accurate genotype at 
the PrP locus of an animal. 
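
As a hedged illustration of how quantifying relative proportions could reveal additional copies, the following Python sketch matches a normalized peak ratio to the nearest whole-number allele dosage. The candidate dosage table, the function name, the nearest-match rule in log space, and the example peak heights are all assumptions made for illustration; they are not a procedure stated in this disclosure.

```python
import math

# Hypothetical candidate dosages, expressed as (copies of X, copies of Y).
CANDIDATE_DOSAGES = {
    "XY": (1, 1),
    "XXY": (2, 1),
    "XYY": (1, 2),
    "XXXY": (3, 1),
    "XYYY": (1, 3),
}


def nearest_dosage(peak_x, peak_y, reference_ratio=1.0):
    """Return the candidate dosage closest to the observed peak ratio.

    reference_ratio is the X:Y signal ratio expected from a true 1:1
    heterozygote in the same assay (assumed here to be known).
    """
    normalized = (peak_x / peak_y) / reference_ratio

    def distance(name):
        x, y = CANDIDATE_DOSAGES[name]
        return abs(math.log(normalized) - math.log(x / y))

    return min(CANDIDATE_DOSAGES, key=distance)


# Hypothetical peak heights in arbitrary fluorescence units.
call = nearest_dosage(2100, 1000)
```

With these assumed peak heights the closest match is an XXY profile, whereas a presence-or-absence readout of the same data would report only an XY heterozygote.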

Accurate genotyping of individuals in a population, including those 
individuals exhibiting complex genotypes, is a necessity for a successful controlled 
breeding program. Failure to account for complex genotypes can confound efforts to 
isolate individuals of a population that are susceptible to a disease, or individuals that 
are refractory or resistant to a disease. In the case of scrapie, accurately determining 
all genotypes in a sheep population, including complex genotypes, can lead to a 
highly successful controlled breeding program where individuals that are resistant to 
developing scrapie are selectively bred, whereas individuals that are susceptible to 
scrapie are prevented from breeding. 

The methods and compositions of the present invention have been used to 
determine and identify complex genotypes in a population of sheep at the PrP locus. 
Although the following discussion relates to a sheep population, the methods and 
compositions of the present invention can be used to determine and identify complex 
genotypes in any population, including human populations. The methods and 
compositions of the present invention can be used to determine and identify complex 
genotypes in any population, including animals such as mammals. Mammals include 
cows, goats, pigs, primates, rodents, and the like. 

Genetic susceptibility testing of sheep to scrapie has highlighted that around 
0.1% of assayed animals exhibit a complex genotype at the PrP gene locus. Results 
generated from primer extension analysis on the sheep population studied have been 
observed to be, on occasion, significantly imbalanced. The majority of the 
imbalanced profiles observed can be accounted for by the presence of additional 
copies of the PrP locus being tested. Possible explanations for this observation include 
gene duplication within or between chromosomes, a trisomy of chromosome 13 (on 
which the ovine PrP gene resides), or placental anastomosis. 

Individuals in a population of animals may be tested for the occurrence of PrP 
alleles, including additional copies of the PrP locus. Those animals displaying PrP 
alleles or combinations of alleles that confer susceptibility to scrapie may be 
identified. Similarly, those animals displaying PrP alleles or combinations of alleles 
that are associated with resistance to scrapie or that are not associated with 
susceptibility to scrapie may also be identified. By identifying or detecting and 
grouping such animals and separating them from a general population, a breeding 
population may be created that exhibits less, little, or no susceptibility to scrapie. 

Further, sheep sperm or ova obtained from a population of sheep may be 
genotyped with respect to the PrP locus, and those samples exhibiting combinations 
of alleles, including three or more alleles, may be separated and stored for use in 
generating populations of sheep that exhibit less, little, or no susceptibility to scrapie. 
Such sperm and/or ova may be employed in artificial insemination, in vivo or in vitro 
fertilization, or other reproductive technologies that are designed to produce animals 
with desirable characteristics, including but not limited to microinjection of genomic 
material into germ cells and embryos. 

Genotyping of sheep with respect to the PrP locus, with the knowledge of 
multi-allelic PrP genes, and with selective breeding of individual sheep selected for 
the presence of desirable genotypes, can lead to a scrapie-resistant sheep population. 

More than 650,000 individual animals have been genotyped by methods of the 
present invention, relying on a proprietary primer extension assay, SNP-IT™, which 
interrogates the four SNPs that determine the coding potential at positions 136, 154 
and 171 (codon 171 harbors two polymorphic nucleotides, and can encode arginine, 
glutamine or histidine). 

In one embodiment, the assay employed here relies on the generation of a 
single 310 bp amplicon of the scrapie gene that contains all four of the polymorphic 
nucleotides. This amplicon then serves as the template during a multiplexed 
fluorescent primer extension assay. As the extension primers employed in this assay 
are distinct in both size and sequence, they can be separated on a capillary 
electrophoresis apparatus to identify the bases present at the polymorphic sites and 
to infer from them the amino acids that will be present in the protein. A preferred assay 
suitable for use with the present invention is disclosed in US Patent Application Serial 
No. 10/179,826, filed 25 June 2002, the entire disclosure of which is hereby 
incorporated by reference. 

The profiles generated by the assay developed here have become familiar and 
predictable in terms of peak intensity, both between different SNPs, and more 
particularly, between the different peaks of a heterozygous call. Clearly aberrant 
profiles were initially observed, and initially appeared to be due to a secondary 
contaminating template being present during the initial PCR reaction. However, 
repeat analysis and retesting from fresh samples taken from a separate bleed of the 
interrogated animals established that the imbalances are real, and due to the 
occurrence of complex genotypes in the individuals exhibiting the imbalances. 

DNA sequencing was undertaken to ascertain whether another, as yet 
unobserved, polymorphism was the underlying cause of the imbalance. Sequencing 
failed to reveal any novel polymorphisms. 



Applicants have discovered that the imbalanced profiles are due to the 
presence of multiple copies of the PrP gene in certain individuals. As a result, it is 
apparent that an animal that genotypes as, for example, an XY heterozygote may 
indicate an excess of the SNPs related to the X allotype, and the genotype might more 
accurately be reported as XXY, or even XXXY. Applicants have also observed cases 
where there are three distinct alleles present, of the form ABC, a phenomenon that is 
not explicable unless there are more than two copies of the locus in question. 

This result may have significant impact on the selection of animals for use in a 
controlled breeding program, given that a double dose of a desirable allele may 
increase the rate of transmission of that desirable characteristic, whereas a double 
dose of an undesirable allele may increase the rate of transmission of the undesirable 
trait to any progeny. This presupposes that the complex genotype observed by typing 
one tissue, typically blood, although not limited to blood, will be similarly represented 
in the DNA of gametes. Any sample that contains or is suspected to contain nucleic 
acids can be assayed by the methods and compositions disclosed herein. 

In any of the embodiments described herein, exogenous template DNA may be 
added to the initial PCR reaction in order to generate a target DNA of sufficient 
abundance that it will promote the primer extension of exogenous probes added to the 
PCR product. As these exogenous sequences may be completely artificial, their 
design is highly flexible, and sequences may be chosen that do not interfere with the 
analysis of the genomic DNA polymorphisms being examined. Probes extended in 
this way have, for example, at least three distinct functions: (i) they function to 
demonstrate that the PCR reaction has been successful; (ii) they function to 
demonstrate that the primer extension reaction has been efficient; and (iii) they function as 
an assay-independent size marker which enables the accurate size assessment of the 
assay-specific products. This feature is illustrated in Figures 23 and 24. 

An advantage of the invention is that it allows the accurate genotyping of 
template DNA samples in which there are greater than two haplotypes of a particular 
form, or an apparent asymmetry in the number of copies of a particular haplotype 
present in a target DNA sample. This may be observed as, for example, additional 
copies of the PrP (prion protein) locus in sheep, or as an asymmetric mixture of two 
or more template DNAs that results in an apparent asymmetry of a specific variant 
nucleotide, or cluster of variant nucleotides identified as a specific haplotype. The 
invention can be applied to a controlled breeding program. The presence of more 
than two haplotypes of a particular form can complicate breeding strategies to 
increase or decrease the prevalence of specific haplotypes. A further advantage of the 
invention is that it enables the quantification of the relative abundance of specific 
variant nucleotides in a target DNA sample, and interpretation of the ratio of each 
variant nucleotide. 

The figures have been simplified for clarity. For example, the extension 
product of a primer that flanks a variant nucleotide is shown as a single peak in the 
figures, as would be the case if the variant position were homozygous. If the variant 
position were heterozygous, two very closely associated peaks may be generated, with 
the two extension products having very slightly different mass:charge ratios, due to 
the different terminal base incorporated, and possibly the different labels attached to 
the terminating base. Differences in 5' tags can alter mass:charge ratios. 

As employed herein, "S" refers to a G or a C, "R" refers to an A or a G, "Y" 
refers to a T or a C, "K" refers to a G or a T and "M" refers to a C or an A. 
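
For readers who wish to apply these single-letter codes programmatically, the definitions above can be expressed as a lookup table. The following Python sketch is illustrative only; the names `AMBIGUITY_CODES` and `matches` are hypothetical.

```python
# Ambiguity codes exactly as defined in the text above.
AMBIGUITY_CODES = {
    "S": {"G", "C"},
    "R": {"A", "G"},
    "Y": {"T", "C"},
    "K": {"G", "T"},
    "M": {"C", "A"},
}


def matches(code, base):
    """Return True if a concrete base (A, C, G or T) satisfies the code.

    A code that is not in the table is treated as a literal base.
    """
    return base in AMBIGUITY_CODES.get(code, {code})


# The "R" shown in a 5' tag stands for a mixture of A and G at that position.
r_allows_a = matches("R", "A")
r_allows_c = matches("R", "C")
```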

Figure 1 illustrates an actual genotyping profile of a sheep analyzed at the PrP 
locus. The profile was generated using an assay disclosed in US Patent Application 
Serial No. 10/179,826. This profile shows unexpected imbalance at the heterozygote 
positions labeled with arrows 1 (solid black) and 2 (diagonal stripes), where the 
expected pattern would have the peak indicated by arrow 2 marginally larger than that 
of arrow 1. It is very much smaller in this example. Those peaks indicated by arrows 
3 (white) and 4 (horizontal stripes) are also unexpectedly imbalanced, with the peak 
indicated by arrow 4 normally being marginally larger than that indicated by arrow 3. 
In this example it is significantly larger. 

Figure 2 illustrates an actual genotyping profile in which there appear to be at 
least three haplotypes present in the DNA. Peaks indicated by arrows 1 (solid black), 
2 (diagonal stripes) and 3 (white) are diagnostic in themselves of the specific 
allotypes ARR (arrow 1), AHQ (arrow 2) and ARH (arrow 3). In addition to this 
unexpected observation, the indicated peaks are each accompanied by a heterozygote 
partner, and in each case the heterozygote result is unusually asymmetric in 
appearance. 

Figure 3 illustrates amplification primers having a mirror of the polymorphism 
in the target nucleic acid. Shown are two amplification primers with tags, where the 
two primers differ only in the identity of a single nucleotide in the tag, and where the 
single nucleotide is flanked on either side by the same nucleotides flanking a single 
nucleotide polymorphism of interest in the target. Bases flanking the 'mirror' 
polymorphism (in the 5' tag) are identical to those flanking the 'real' polymorphism 
in the target. One of the amplification primers is modified to have a 5' DNA 
sequence (shown dotted) largely unrelated to the target DNA template, or any other 
DNA sequence from the organism from which the target DNA is derived. This 
amplification primer is a population of two very similar, but distinct sequences, with 
the primers annealing to the same target (the complement of the solid arrow 
sequence), but differing from each other in that one single nucleotide in the tag is 
different. This single nucleotide position 'mirrors' the 'real' SNP polymorphism 
targeted. When amplification is performed with a combination of the two primers 
shown, and an opposing primer (not shown) the effect is to generate a pool of 
amplicons in which a 'copy' of the (heterozygote form of the) 'real' SNP is generated 
in the terminal end of the amplicons, these 'mirror SNPs' being derived from the 5' 
tag of these hybrid primers. Note that the base immediately before the SNP and 
immediately after the SNP are shown, although not specifically identified. It may be 
that as few as zero bases are copied from the 'real SNP', but it may be preferable to 
have more than one base before or after or before and after the SNP to have the 
'mirror SNP' behave in the same or similar fashion as the 'real SNP' with respect to 
relative efficiency of incorporation of chain terminating nucleotides upon a primer 
extension reaction being performed against both the real SNP and the mirror SNP. 

Figure 4 illustrates a diagnostic amplification of a target nucleic acid sequence 
comprising a single nucleotide polymorphism, and the amplicon that would result. In 
this embodiment, at least three different amplification primers are employed to 
amplify a target nucleic acid comprising a polymorphic nucleotide, such that the 
polymorphic nucleotide is included in the resulting amplicon. Two of the primers, as 
illustrated, will have 3' ends of identical sequence capable of hybridizing to the target 
nucleic acid at the same sequence, for which they compete equally, so as to amplify 
the region of the target nucleic acid having the polymorphism in it, and will have a 5' 
tag sequence largely incapable of hybridizing to the target sequence. This 5' tag 
sequence is shown to contain an 'R', representing that both G and A bases are present 
at this position, and that this is the only difference in the sequence of the 5' tag. The 
third primer, also shown, will hybridize to the other strand of the duplex distal to the 
polymorphism, so that the amplicon will contain the polymorphism of interest. The 
two primers that hybridize to the same sequence in the target have 5' tags that are 
substantially incapable of hybridizing to the target nucleic acid under the conditions 
of the amplification reaction and differ only in a single nucleotide residue in the 5' 
tag. These primers bear the image of the targeted polymorphism in their 5' tag. In 
this embodiment, the A and G in the tags are also flanked with the same nucleotides 
that will flank the targeted polymorphism in the amplicon. The employment of 
primers such as those described above affords the ability to amplify a target nucleic 
acid so as to generate an amplicon having a "mirror SNP" generated through 
judicious selection of the 5' tags wherein the "mirror SNP" is generated in a known 
and controlled ratio. In one embodiment, the tagged primers are preferably employed 
in equal ratios in order to generate an amplicon wherein the A:G ratio in the resulting 
population of amplicons is 1:1, mimicking a heterozygous site on the same amplicon 
as the polymorphic site amplified from the target nucleic acid. An illustration of this 
embodiment is shown in Figure 4, where the A:G ratio generated by the 5' tags of the 
primers is referred to as a "mirror SNP," and the polymorphic site amplified from the 
target nucleic acid is referred to as a "real SNP." As can be seen in this embodiment, 
each mirror and real SNP resides in the same sequence context in that the bases 
flanking each mirror and real SNP are identical. This embodiment provides an 
advantage in that heterogeneities in primer extension reactions carried out at these 
sites that may be due to sequence context differences are advantageously reduced. 
Further, this embodiment is particularly advantageous in that the signal generated 
from the mirror SNP should provide an observer with a clear heterozygote signal 
following a single base primer extension reaction. The generation of a reliable 
heterozygote mirror SNP allows for the assessment of the situation at the targeted 
real SNP, giving a reference point as to the efficiency of incorporation of a G 
terminator and an A terminator where each template is in equivalent abundance. The 
examination of the relative efficiency of incorporation of the terminating bases 
following primer extension against the real SNP present in the amplicon allows 
determination of the zygosity of this polymorphism, that is whether it too is a 
heterozygote, or is significantly skewed towards incorporation of one or other of the 
bases exclusively, as would be expected of a homozygote SNP. 
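
The use of the mirror SNP as a reference point can be sketched, purely by way of example, as follows. The sketch assumes that the mirror SNP templates are present at a 1:1 ratio, so that the mirror signals report the relative incorporation efficiency of the G and A terminators. The function names, the 0.85 cutoff, and the example signal values are hypothetical and are not specified in this disclosure.

```python
def corrected_g_fraction(real_g, real_a, mirror_g, mirror_a):
    """Estimate the G template fraction at the real SNP.

    Each real-SNP signal is scaled by the signal that the same terminator
    gave at the mirror SNP, where G and A templates are assumed equimolar.
    """
    if mirror_g <= 0 or mirror_a <= 0:
        raise ValueError("the mirror SNP must give signal for both terminators")
    g = real_g / mirror_g
    a = real_a / mirror_a
    if g + a == 0:
        raise ValueError("no signal was detected at the real SNP")
    return g / (g + a)


def call_zygosity(fraction_g, homozygous_cutoff=0.85):
    """Classify the real SNP; the cutoff is an illustrative assumption."""
    if fraction_g >= homozygous_cutoff:
        return "GG"
    if fraction_g <= 1 - homozygous_cutoff:
        return "AA"
    return "AG"


# Hypothetical signals: the real SNP shows the same G:A imbalance as the mirror.
fraction = corrected_g_fraction(real_g=1200, real_a=800, mirror_g=1200, mirror_a=800)
zygosity = call_zygosity(fraction)
```

In this assumed example the raw signals at the real SNP are unequal, yet the corrected fraction is 0.5 because the mirror SNP shows the same imbalance, so the site is called heterozygous.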

Figure 5 illustrates the products of a 50:50 diagnostic amplification and shows 
how the lower or upper strand may be employed in a diagnostic primer extension 
reaction. Figure 5 illustrates an embodiment wherein an amplicon has been generated 
having a mirror SNP and a real SNP. The mirror SNP and real SNP are present on the 
same amplicon and are therefore in molar equivalence, regardless of the efficiency of 
the PCR reaction. Both positions can be interrogated simultaneously by using distinct 
and distinguishable SNP-IT™ primers which must hybridize to the same strand, so 
that the sequence context around the SNP is maintained between the mirror SNP and 
the real SNP, and so that any influence this sequence context might have on the 
efficiency of incorporation of one chain terminating nucleotide over the other might 
be normalized between the mirror and real SNP interrogations. Note that if 
interrogating the mirror and real SNPs as an addition of C and T terminators from the 
upper strand (the extension product of the tagged primer), the excess amplification 
primer should be efficiently removed (by, for example, Exo I digestion) prior to the 
SNP-IT™ extension reaction. Failure to do so may result in the excess primer being 
available to act a template during the SNP-IT™ extension reaction, and loss of the 1:1 
molar ratio of the mirror:real SNP. Interrogation of the mirror SNP and real SNP as 
addition of an A and G terminator on the lower (daughter) strand of the hybrid primer 

30 avoids this problem, but requires that sufficient DNA sequence is provided 5' of the 
mirror SNP position in the hybrid primer to allow stable hybridization of a SNP-IT™ 
primer on the daughter strand. Note that the terminal 3' base in the extension primers 
is shown to be identical to the base immediately preceding the mirror and real SNP in 
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the appropriate direction, and that the base beyond the variant nucleotide is also 
maintained between mirror and real SNP in this example. 

Figure 6 illustrates three possible outcomes where the target comprises a 
single source of template DNA, employing an A/G polymorphism for illustration
only. Panel A results where the target SNP is homozygous GG; Panel B results where
the target SNP is heterozygous AG; Panel C results where the target SNP is
homozygous AA. The mirror SNP and real SNP are present on the same amplicon
and are therefore in molar equivalence, regardless of the efficiency of the PCR
reaction. Both positions can be interrogated simultaneously by using distinct and
distinguishable SNP-IT™ primers which hybridize to the same strand, so that the
'sequence context' around the SNP is maintained between the mirror SNP and the real
SNP, and so that any influence this sequence context might have on the efficiency of
incorporation of one chain terminating nucleotide over the other might be normalized
between the mirror and real SNP interrogations. It is not an absolute requirement that
the signal strengths of the mirror SNP heterozygote and the real SNP result be
of equivalent intensity, although this would be the most advantageous situation. It is
only necessary that there be sufficient signal intensity at the mirror and the real SNP
that a ratio between the heterozygote peaks at each can be determined which is above
the level where stochastic fluctuations and artifactual noise may account for a
significant portion of the detected signal. It may also be advantageous to set an
acceptable level of signal detection from the (artificial) mirror SNP before any result
from the associated real SNP is taken as valid. This may be of particular utility in
multiplex analyses where failure of one specific amplicon to amplify to acceptable
levels could be ascertained by examining the signal from that amplicon's mirror SNP.
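
By way of illustration only, the comparison described for Figure 6 might be sketched in Python as follows. The function name, the minimum signal level, and the permitted skew are assumptions introduced for this sketch and are not values taken from this disclosure; suitable values would have to be established empirically for a given assay.

```python
import math


def call_zygosity(mirror_a, mirror_g, real_a, real_g,
                  min_signal=100.0, max_log2_skew=1.0):
    """Call an A/G real SNP by comparing it with its mirror SNP.

    All arguments are peak signals (for example, peak areas).
    min_signal and max_log2_skew are hypothetical thresholds and
    min_signal must be positive.
    """
    # Reject the result if the mirror heterozygote is too weak to give
    # a trustworthy reference ratio (amplicon may have failed).
    if min(mirror_a, mirror_g) < min_signal:
        return "no call"
    # Reject the result if the real SNP gave essentially no signal.
    if real_a + real_g < min_signal:
        return "no call"

    eps = 1e-9  # avoids division by zero for a pure homozygote
    mirror_ratio = mirror_a / mirror_g
    real_ratio = (real_a + eps) / (real_g + eps)

    # Normalize the real SNP ratio to the mirror (known 1:1) ratio.
    skew = math.log2(real_ratio / mirror_ratio)
    if abs(skew) <= max_log2_skew:
        return "AG"
    return "AA" if skew > 0 else "GG"
```

In this sketch a real SNP whose A:G ratio lies within a factor of two of the mirror ratio is treated as a heterozygote; anything more skewed is treated as a homozygote of the over-represented base.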

Figure 7 illustrates how outcomes would appear in a multiplex assay of eight 
polymorphisms from a single source template DNA. For any given amplicon, assayed 
at both the mirror (M) and real (R) SNP, there are only three possible outcomes for 
the real SNP, given a single source of template DNA. The mirror SNP will always
return a heterozygous signal, given that this amplicon has been produced to assayable
levels within the multiplex, and the ratio of incorporation of one nucleotide over the
other will be measurable in some fashion, although the absolute level of incorporation
may differ from M to R (despite their molar equivalence), and from individual SNP to
SNP. The signal returned by the real SNP will either be very close to this
heterozygote signal ratio, or it will be severely skewed to one or other side of the
ratio, indicating that the real SNP was a homozygote of one or other flavor. By flavor
is meant the type of SNP, which will be one of the six different combinations that are
possible (AG, AC, AT, TG, TC, GC). The signal returned in this case may also be
larger than that returned by the mirror SNP, although not necessarily so, given that the
SNP-IT™ primers used to interrogate the mirror and real SNPs may have differing
hybridization characteristics. However, the ratios between peaks generated from
mirror and real SNPs will be comparable, given that all experimental variables and
sequence context variables are automatically normalized using this described system
of analysis. A panel of eight unidentified SNPs has been used here for demonstration
only. Note that SNP 5 is shown to have generated particularly weak signals for both
the mirror and the real SNP, as might be expected if the targeted amplicon had failed
to generate effectively in the multiplex analysis. Ratios taken from such weak signals
will be more prone to error. This system would be applicable to
any SNP panel, or any mixture of different SNPs, given that each of the nucleotide
species used as a terminator has a distinguishable characteristic. Also, the signal
returned is shown as peaks, perhaps on a capillary electrophoresis instrument where
the real SNP SNP-IT™ primer migrates more slowly than the mirror SNP primer, and
the terminating nucleotides carry some detectable label, such as a fluorescent dye.
The mirror SNP heterozygote result is shown proximal to the corresponding real SNP
result, but it need not be the case that mirror and real SNP results are close to each 
other as shown, only that they are distinct. Any system that can discriminate between 
the products of the SNP-IT™ reaction would be applicable to the technique. 

Figure 8 illustrates a single tube test for the interpretation of a possible mixed 
DNA sample, targeting a polymorphism that is a transversion flanked on either side 
by nucleotides that are complementary to one another. A solid box indicates a
nucleotide complementary to that in a hatched box. The ratios shown and the use of G/C are
for example only. One skilled in the art will understand that any transversion and any
ratio could be used in this assay. Also, only a single flanking nucleotide is shown
around the real and mirror SNP. This is for clarity only in the diagram. It may be that
more than one nucleotide on one or other or both sides of the SNP are required to give
the same, or similar, efficiency of incorporation of the terminating bases upon primer
extension, or that as few as zero bases need be complementary. Note that the ratio of
3:1 (G:C) on the forward primer shown is achieved by carrying out the initial PCR
reaction with the forward primer bearing the G at a concentration 3 times higher than
the forward primer bearing a C. This ratio is reversed on the daughter strand, as there
will be 3 times as much C on the daughter strand as there will be G. The reverse
primer is also shown as a mixed population of two primers differing only in the
identity of a single nucleotide. Here the ratio of G-bearing primer to C-bearing
primer is controlled in the initial PCR to be equivalent, generating a mirror SNP in the
amplicon at 1:1 ratio. Note also that the bases around both the artificially introduced
polymorphic sites are reversed to maintain the sequence context of the mirror SNP,
matching the real SNP.
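
The reversal of the primer ratio on the daughter strand can be illustrated with a short Python sketch. The function names and the use of relative primer concentrations as integers are assumptions made for this example only.

```python
from fractions import Fraction

# Watson-Crick complements.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def daughter_strand_mix(primer_mix):
    """Map the variant bases carried by a primer mix to the daughter strand.

    primer_mix maps the variant base in the 5' tag to the relative
    concentration of the primer bearing it, e.g. {"G": 3, "C": 1}.
    Each primer is copied into its complement, so the abundance
    attached to G on the primer strand appears as C on the daughter.
    """
    return {COMPLEMENT[base]: conc for base, conc in primer_mix.items()}


def ratio(mix, x, y):
    """Return the x:y ratio in a mix as an exact fraction."""
    return Fraction(mix[x], mix[y])
```

For a forward primer mix of G:C at 3:1, `ratio(daughter_strand_mix({"G": 3, "C": 1}), "G", "C")` gives 1/3, whereas a 1:1 mix is unchanged on the daughter strand.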

Figure 9 illustrates a quadruplex reaction wherein the four extension primers 
are used to generate data from three known ratios of G:C and a polymorphic S residue
of unknown ratio. The known ratios of G:C are 3:1, 1:1, and 1:3. Either, but not
necessarily both, of the real SNP S primers is required. A single asterisk indicates
that only one of the primers so marked is necessary, and one of these primers may be
judged to have preferable sequence characteristics over the other, and therefore be the
preferred choice in a primer extension reaction. The Exo I marking indicates
that these primers should only be used given the efficient removal of the
complementary initial amplification primer prior to the identification step. Note that
the 1:1 polymorphism could also be interrogated on the other strand (primer not
shown) but this would necessitate that a larger 5' tag sequence be used on the initial
amplification primer, to give sufficient template DNA in the amplicon to support
stable hybridization of the primer employed for primer extension. In order to both
generate data on the three ratios of G:C produced (3:1, 1:3 and 1:1) and information
on the real SNP, one need only use four extension primers, as shown. The use of only
four extension primers will reduce the analysis required, and increase the potential to
multiplex the analysis of different SNPs, with each SNP requiring the analytical
'space' (be it on a capillary, or other analysis readout platform) to fit all the different
extension products whilst maintaining distinct identification from each other. Note
that regardless of the extension primers used, the terminal 3' base of all primers is
demonstrated to anneal to a 'hatched box' nucleotide, which is the complementary
base of the 'solid box' nucleotide. As demonstrated, the sequence context around the
targeted SNPs is maintained regardless of which strand is being utilized as template
during the identification reaction. 

Figure 10 illustrates how results from a quadruplex single tube G:C (3:1, 1:1 
and 1:3) assay would ideally appear. A graph of relative efficiency of incorporation
of the X and Y terminators (here shown as G and C) may be generated, and the 
observed efficiency of incorporation at the 'real SNP' placed somewhere on this 
graph, indicating the relative proportions of the X and Y nucleotide that must have 
existed in the template DNA used to seed the reaction. This graph is for illustrative 
purposes only, as it is unlikely that the graph generated by this technique will in fact
be a straight line as shown, particularly if the primer extension reaction has been 
allowed to proceed to the point of saturation (that is, where some necessary 
component of the primer extension reaction has been exhausted). 

Figure 11 illustrates the more likely shape of the graph of relative
incorporation of two terminating nucleotides in a primer extension reaction. The
graph contacts the axis at the two extreme points of homozygosity, and presumes that
even at these points, the quantity of amplicon being analyzed is equivalent. It will be
the case that for each individual SNP there will be a specific mathematical function
that describes the shape of the curve, and this shape will be derived empirically for
each SNP flavor in a necessarily limited number of local sequence contexts.
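
One simple way to read an unknown template proportion from such an empirically derived curve is piecewise-linear interpolation between the known-ratio mirror SNP points. The Python sketch below is an illustration under stated assumptions, not a method prescribed by this disclosure: signals are expressed as the fraction of total signal attributable to terminator X, the two homozygote end points are taken to be (0, 0) and (1, 1), the calibration is assumed to be monotonic, and the calibration values in the usage note are invented.

```python
def estimate_template_fraction(calibration, observed_fraction):
    """Estimate the fraction of template bearing nucleotide X.

    calibration: list of (known_template_fraction, observed_signal_fraction)
        pairs from the known-ratio mirror SNPs (e.g. 3:1, 1:1 and 1:3
        correspond to template fractions 0.75, 0.5 and 0.25).
    observed_fraction: signal fraction of X measured at the real SNP.
    """
    # Order the calibration points by observed signal and add the two
    # homozygote end points where the curve is assumed to meet the axes.
    points = [(0.0, 0.0)] + sorted(calibration, key=lambda p: p[1]) + [(1.0, 1.0)]

    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        if s0 <= observed_fraction <= s1:
            if s1 == s0:
                return (t0 + t1) / 2.0
            # Linear interpolation within this segment of the curve.
            return t0 + (t1 - t0) * (observed_fraction - s0) / (s1 - s0)
    raise ValueError("observed_fraction must lie between 0 and 1")
```

For example, with hypothetical calibration points `[(0.25, 0.40), (0.50, 0.65), (0.75, 0.85)]`, an observed signal fraction of 0.65 at the real SNP maps back to a template fraction of 0.5.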

Figure 12 illustrates how a transversion polymorphism can be assayed with up 
to six extension primers in a single tube. A single asterisk indicates that both of these
target identification primers may be used, and their results averaged, giving careful
consideration to the fact that in a homozygote, or mixed, sample one or other of these
results must first be changed to the reciprocal value, as any skewed value will be
reversed on the opposite strand. A double asterisk indicates that both of these tag
identification primers may be used, and their results averaged, as these do represent a
true balanced heterozygote when assayed on either strand. Primers labeled Exo I
should only be used if amplification primers are efficiently removed prior to their
use. It is possible to analyze the three different SNPs with up
to 6 SNP-IT™ primers, analyzing each on both the upper and lower strand. Where
the mirror SNP has generated a ratio switch on the other strand, this must be analyzed
on both strands in order to generate information on the efficiency of terminator X and
Y at both ratios. However, the other mirror SNP, generated from the other
amplification primer, will not undergo this ratio switch when it is duplicated. It will
merely reproduce a copy of itself at 1:1 ratio again. These two 1:1 ratios could be
analyzed on both strands using two distinct SNP-IT™ primers, and the efficiency of
incorporation of the X and Y terminators averaged between the two extension
reactions. Similarly, the 'real SNP' could also be analyzed on both strands, and a
consensus of the ratio of X and Y generated, bearing in mind that this will switch
from strand to strand given that the 'real SNP' is imbalanced as a result of being a
mixture of more than one template. Mathematical correction of such an observed
switch should be done before a consensus ratio is calculated.
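
The correction described above, in which the result from one strand is converted to its reciprocal before the two strands are combined, might be sketched as follows. The function name is hypothetical, and the choice of a geometric mean as the form of averaging is an assumption of this sketch; the disclosure states only that the results are averaged.

```python
import math


def consensus_ratio(upper_ratio, lower_ratio):
    """Combine X:Y ratios measured on the upper and lower strands.

    Both arguments are X:Y incorporation ratios and must be positive.
    A skewed ratio is reversed on the opposite strand, so the lower
    strand value is first converted to its reciprocal.
    """
    corrected_lower = 1.0 / lower_ratio
    # Geometric mean, chosen here because the quantities are ratios.
    return math.sqrt(upper_ratio * corrected_lower)
```

A balanced 1:1 site is unaffected by the correction, since the reciprocal of 1 is 1; a site read as 3:1 on one strand and 1:3 on the other yields a consensus of 3:1.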

Figure 13 illustrates how results of a hexaplex one-tube SNP ratio matrix
might appear. A graph of relative efficiency of incorporation of the X and Y
terminators (here shown as G and C) may be generated, and the observed efficiency
(average) of incorporation at the 'real SNP' placed somewhere on this graph,
indicating the relative proportions of the X and Y nucleotide which must have existed
in the template DNA used to seed the reaction. Note also that the efficiency of the 1:1
'mirror SNP' is an average, although to emphasize this point both the 1:1 mirror SNP
and the 'real SNP' points are shown as two closely associated points on the graph.
This graph is for illustrative purposes only, as it is unlikely that the graph generated 
by this technique will in fact be a straight line as shown. 

Figure 14 illustrates the more likely shape of the graph of relative 
incorporation of two terminating nucleotides in a primer extension reaction. The
graph contacts the axis at the two extreme points of homozygosity, and presumes that
even at these points, the quantity of amplicon being analyzed is equivalent. It will be
the case that for each individual SNP there will be a specific mathematical function
that describes the shape of the curve, and this shape will be derived empirically for
each SNP flavor in a limited number of local sequence contexts. Note that the 1:1
ratio and the real SNP analysis can be averaged after being analyzed on both strands,
and appropriate remedial action taken to account for any deviation from a 1:1 ratio
that might be encountered at the target-derived real SNP. These points are shown as 
closely associated points to emphasize this only.

Figure 15 illustrates how results from a multiplex reaction assaying a plurality
of polymorphisms would appear. In the case illustrated, six polymorphisms are 
analyzed using four extension primers per SNP, three interrogating the known ratio 
mirror SNPs and one interrogating the real SNP. As before, it will be possible to 
carry out multiplex analysis of many 'real SNPs' at once, whilst generating a standard 
curve or other mathematical predictor for each, and reading the relative proportions of 
the X and Y nucleotides from the individual graphs (where X and Y represent the two 
nucleotide species possible for any individual SNP). By combining the information 
from many different SNPs co-analyzed in this manner, it will be possible to come to a 
consensus as to the proportions of individual DNA templates present in a mixture, and 
this will be facilitated if the DNA profile of one of the individuals is known (for example,
the profile of the victim of a sexual assault). 

Figure 16 demonstrates an alternative and most preferable empirical means of 
defining a mathematical function between the ratio of incorporation of one nucleotide 
over the other in a primer extension reaction for a given SNP in a specific sequence 
context. Illustrated are eight different ratios of a transversion SNP introduced into 
both terminal ends of an amplicon population through an amplification reaction, 
although the actual ratios used may be more or less extensive than those shown here. 
These differing ratios permit the plotting of a relationship between actual level of 
incorporation, and the known ratio of availability of template DNA. It may not be 
possible to carry out these various ratio checks in a single tube assay, and for this 
reason, a 1:1 ratio control is incorporated into the amplicon populations to verify that 
the level of incorporation here is the same, and that the results from the various
skewed mirror SNPs can be assembled together into a mathematical relationship. It
may be that having completed this work for one SNP, the mathematical relationship
will be applicable to all SNPs flanked by specific sequences, or it may be that this
work will have to be repeated for each SNP flavor in all local sequence environments.
This will be a necessarily limited number of experiments.

Figure 17 illustrates the simplest relationship between efficiency of 
incorporation of two chain terminating nucleotides, which is a linear relationship, and 
also a more complicated mathematical relationship in the form of a logarithmic/linear
relationship. Other mathematical functions, such as exponential functions, may also
describe the efficiency of incorporation of chain terminating nucleotides at specific 
ratios, and these may be determined empirically for each specific SNP flavor in a 
specific local sequence context. Converting the relationship to a linear relationship 
facilitates the determination of an unknown ratio from an observed ratio of
incorporation of terminating nucleotides. 
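
As a sketch of how such a conversion to a linear relationship might be used, the Python example below fits a straight line to the logarithm of the observed incorporation ratio against the logarithm of the known template ratio, and then inverts the line to estimate an unknown template ratio. The assumption that both axes are logarithmic, the function names, and the numbers in the usage note are illustrative only; the appropriate function for any given SNP would be determined empirically as described above.

```python
import math


def fit_log_linear(known_ratios, observed_ratios):
    """Least-squares fit of log(observed) = slope * log(known) + intercept.

    Requires at least two distinct known ratios; all ratios must be positive.
    """
    xs = [math.log(k) for k in known_ratios]
    ys = [math.log(o) for o in observed_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_template_ratio(observed_ratio, slope, intercept):
    """Invert the fitted line to estimate the template ratio."""
    return math.exp((math.log(observed_ratio) - intercept) / slope)
```

For instance, the known mirror ratios 3:1, 1:1 and 1:3 and their observed incorporation ratios would be passed to `fit_log_linear`, and the ratio observed at the real SNP then passed to `predict_template_ratio` together with the fitted slope and intercept.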

Figure 18 illustrates a practical example of the invention described. The
diagram illustrates the amplification of a portion of the ovine PrP gene using initial
amplification primers (block arrows) which flank 4 polymorphic nucleotides at
positions 136, 154, 171-1 and 171-2. These polymorphic nucleotides are re-created
upon amplification by virtue of the attachment of 5' tag sequences to the initial
amplification primers. Each primer re-creates two of the polymorphic nucleotides
(136 and 154 in the tag of the forward primer, and 171-1 and 171-2 in the tag of the
reverse primer). Each of the polymorphic sites is re-created as a balanced
heterozygote as a result of use of equimolar amounts of the initial amplification
primers (a total of two distinct forward primer sequences, and two distinct reverse
primer sequences). Only the local sequences around the genuine and re-created
heterozygote sites are shown (2 bases 3' and 1 base 5', with respect to the amplicon
target of the primer extension reaction). The genuine polymorphic sites are
interrogated as shown using four primers (solid block arrows, 5' end only indicated)
whereas the re-created balanced heterozygotes are interrogated by a distinct set of
four primers (hatched block arrows). Each pair of interrogating primers is distinct,
but shares commonality over the terminal two bases at their 3' ends. Note that the
primers interrogating the re-created 136 and 171-1 polymorphic sites are
complementary to a portion of the block arrows (the template-specific portions of
the initial amplification primers). This is a means of limiting the necessary length of
the 5' tag sequence required, which benefits the efficient synthesis of these elongated
hybrid primers. Note also that due to the nature of the 171-1 and 171-2
polymorphisms (these sites are immediately adjacent), it is not possible to have these
re-created sites completely reflect the real situation with respect to the single
nucleotide 5' of the polymorphic site being interrogated. Here, the more common of
the two possible bases has been indicated in each case. None of the primer extension
probes shown indicate any 5' modifications, such as poly T mobility modifiers, which
can be used to affect the position to which the various species migrate under
electrophoresis. These modifiers are omitted from the diagram for clarity. 

Figure 19 illustrates the output from an analysis of ovine DNA as in Figure 18,
but indicates each of the real SNPs as a heterozygote: something which is not
expected to occur in nature. The ratio between each of the nucleotides incorporated at
the mirror SNP can be compared to the ratio observed at the real SNP, and judgment
made as to whether this ratio is indicative of a balanced heterozygote at the real SNP,
or if there is a distortion which may indicate the presence of additional copies of the
PrP gene in the original template. Note that the mirror SNPs are shown running more
slowly (with greater apparent molecular mass) than the real SNP extension products.
This need not be the case, so long as each species is uniquely identifiable. Further
note that the absolute areas between the real and mirror SNPs need not be the same;
only the ratio of the areas between the incorporated terminators at any given
heterozygote pair need be. Sizing controls are also indicated in this representation.

Figure 20 illustrates that the described system can be used to monitor for the 
presence of a deletion as the variant nucleotide. The product produced from this 
amplification is limited to show the generation of a product which contains the
targeted deletion, but the system may also be used to generate a population of amplicons in
which both the deletion and the wild type target DNA are equally represented. This
can be achieved by combining equivalent concentrations of initial amplification
primers with the deletion and the wild type sequence represented in the 5' tag
sequence. The primer used to probe this artificial representation of the variant
nucleotide could be targeted to extend against a nucleotide within the deleted
sequence, or the invariant nucleotide 5' to the deletion site. A large number of
potential targets for extension can be envisaged to effect the detection of the deletion
and the wild type sequence, and which one is most appropriate will be dependent on
such variables as the extent of the deletion, and the DNA sequence in and around the
site of the deletion.

Figure 21 illustrates that the system can also be used to detect insertions, in a 
manner analogous to the detection of deletions. The amplification primer is shown to 
bear a 5' tag sequence which mimics the sequence of the insertion targeted, but by
combining equal proportions of initial amplification primers, one bearing the insertion
in the 5' tag and the other bearing the wild type sequence of the target DNA, a
population of amplicons will be produced in which the insertion and the wild type
sequence are equally represented. Interrogation of the variant nucleotide in the 5' tag
sequence can be directed either at a nucleotide within the insertion, or at an invariant
nucleotide 5' to the site of the insertion. A large number of systems can be envisaged
to effect the detection of the insertion and the wild type sequence, and which one is
most appropriate will be dependent on such variables as the extent of the insertion,
and the DNA sequence in and around the site of the insertion.

Figure 22 illustrates that a form of variant nucleotide may be introduced to the 
amplicon using two initial amplification primers each bearing an almost identical 5'
tag, but differing in a variant nucleotide, which may be a single nucleotide, or a
number of nucleotides, such as a deletion or an insertion. This system will generate a
population of amplicons in which the variable nucleotide is represented on opposite
strands of the amplicon, but the variable nucleotide forms will be balanced at a ratio
approaching absolute 1:1. This system overcomes the problem of having to balance
two separate primers competing for the same target DNA specific hybridization site.
However, as a result of the extreme degree of homology between the 5' tags, it is
possible that the initial amplification primers' 5' tags could bind to the 3' ends of
their own extension products, either inter-molecularly or intra-molecularly. Careful
design of assay conditions, and the thermal profile during the amplification in
particular, is a necessity to ensure this does not interfere with the PCR amplification.

Figure 23 illustrates the general layout of an exogenous control system. The
two long lines indicate two artificial, complementary sequences. These are added at
known copy number to an initial PCR, and are amplified by the indicated PCR
primers (PCR 1 and PCR 2) concomitantly with any amplicons generated in the PCR
from the analytical template DNA and primers. When the artificial amplicons have
been generated, they will be available to promote the predictable primer extension of
the indicated probes. Two probes can be used, targeting the same base within a 
region of palindromic DNA, which may be designed to be a restriction site, for 
example. Multiple probes can be used with differing length tags on the 5' end 
(indicated by bracketed dotted lines).

Figure 24 illustrates one specific set of DNA sequences that works efficiently
as an exogenous control DNA sequence. Reference should be made to Figure 23 for 
the function of each DNA sequence. The restriction site specified in the center of this 
100 bp artificial construct (SEQ. ID NO. 38) is a Pvu II restriction site (underscored).
The targeted bases are the Ts within the Pvu II recognition site, targeted on both the
upper and lower strands by probes of differing sequence, but which have sequence
commonality at their 3' ends. The palindromic nature of the Pvu II site is extended by
two further bases prior to the recognition site (overscored on the upper strand only), to
further enhance the sequence similarity on the upper and lower strands, such that the
probes have 3 bases at their 3' end in common, but the sequence from which they
prime is common over 10 bases (3 before the targeted base, the targeted base itself,
and then a further 6 bases after the targeted base). In this example, the probes extend
to incorporate a terminating A (bold, underscored). The 5' ends of the primers and
probes are shown for clarity only. The primer and probe of the upper strand (SEQ. ID
NOs. 39 and 40) and the primer and probe of the lower strand (SEQ. ID NOs. 41 and 
42) are shown aligned over the amplicon (SEQ. ID NO. 38). 

Figure 25 illustrates the result obtained when the construct described in Figure 
18 was generated and then probed using only the 171-1 real and 171-1 mirror SNPs.
The figures show the patterns obtained when (i) a homozygote GG real SNP was used
as template, (ii) a heterozygote GA real SNP was used as template, and (iii) a
homozygote AA was used as template. Note that in all three cases the mirror SNP
that has been artificially introduced to the amplicons is interrogated to give a
heterozygote profile of similar balance to the real heterozygote (ii) example.

Figure 26 illustrates the results of analysis of the construct described in Figure 
18 where each of the four real SNPs and each of the mirror SNPs, and also the four
invariant control probes, are being used in a multiplex primer extension reaction. The
image shows that each of the real SNPs is a homozygote, whereas each of the mirror
SNPs is a heterozygote, as anticipated. The position of the mirror SNP heterozygotes
is indicated in the satellite boxes. The mirror SNP probes co-migrate in this example,
but by judicious introduction of additional bases at the 5' end of the mirror probes
these can be separated such that each occupies a discrete region of the
electropherogram. This change has been introduced to the 'real SNPs,' which
migrate to discrete regions of the electropherogram.

Figure 27 illustrates the component parts of a SNP-IT™ interrogation of SWaP
SNP TSC0084838. This SWaP SNP was amplified using 5' tagged primers balanced
as 1:1 on both forward and reverse primer mirror SNP positions. The resulting
amplicons will therefore have a 1:1 ratio of G:C at all of the mirror positions.
Analysis shows that the C terminator is incorporated with much lower efficiency than
is the G terminator. However, this example demonstrates that the imbalance apparent
at the real SNP, which is a heterozygote in this example, is normal for this particular
SNP, and the similar imbalance at each of the mirror positions gives high confidence
that this is the case. Note that as the primers were used at 1:1 ratio in this example,
Mirrors 1 and 2 have reversed to deliver a 1:1 ratio on the daughter strand. Real SNP
3 also has the same ratio of incorporation, and the final mirror 4 also delivers a
credible 1:1 balance. Note that in this example, Mirror 2 was placed in an inverted
palindromic sequence, such that it is not a true mirror. It is therefore surprising that 
there appears to be little difference in the efficiency of incorporation of the G and C 
terminators at this polymorphism and the other mirror and real SNP positions. 

Figure 28 illustrates a 310 bp amplicon (SEQ. ID NO. 43) produced during the
initial amplification phase of the scrapie assay. The positions of four polymorphic
sites within the sequence are indicated by Y, R, R, and K, corresponding to changes in
codons 136, 154, and 171 (two immediately adjacent SNPs). 

Figure 29 is an annotated representation of a scrapie assay, not modified to
include 5' tags with mirror SNPs. The sequence shown is the plus strand only (SEQ.
ID NO. 44). Thus, some of the highlighted sequences refer to the inverse complement
of the true primer/probe sequence. Initial amplification primer positions are shown in
bold. Probes are shown underlined, flanking the SNP positions. Probes anneal to
different strands of the target amplicons, such that 136 incorporates either a C or T,
154 incorporates either a C or T, 171-1 incorporates either a G or A and 171-2
incorporates either a C or A. The 154 and 171-1 probes anneal to different strands of
the amplicons, but they share six complementary bases over their terminal 5' ends,
shown bold. The amino acid translation for any particular SNP variant is shown
bracketed on the right hand margin. Sections of substantially invariant sequence are
used to generate control extension product. Probe sequences are shown italicized,
with 5' poly T additions indicated bracketed at the 5' terminus of the template-specific
portion of these control probes. There are in total four control probes
competing for two target sequences, and as the shorter probe is positive and the longer
probes are negative, all of these flank an invariant G incorporation. 

It will be appreciated by those of skill in the art, after having read and 
understood this disclosure, that a large plurality of embodiments employing the
compositions and methods taught by this invention can be carried out without undue
experimentation. Such embodiments include combinations of the embodiments
disclosed herein. Further, one skilled in the art will appreciate that the introduction of
exogenous sequences into amplicons by employing 5' tags comprising one or more
variant nucleotides affords great versatility in designing identification primers.
Further still, the employment of 5' tags in identification primers, such as for purposes
of identification, capture, and/or detection, will similarly be appreciated by one skilled
in the art as an advantage that affords great versatility for analysis of results. These
and other advantages will become apparent to one skilled in the art upon reading and
understanding this disclosure.

One skilled in the art will appreciate that through judicious choice of 
exogenous 5' sequences attached to identification primers, large multiplex 
amplifications can be constructed that can generate products capable of aiding both
in the interpretation of individual detection primer reactions, and in the overall
interpretation of the multiplex assay, by utilizing the individual primers as control
components in the assay. 

In a preferred embodiment of the invention, analysis of the products of the 
primer extension reactions can be done so as to determine the relative abundance of 
labeled identification primers. Abundance analysis can be undertaken by determining 
the identity of the nucleotide incorporated into an identification primer, the identity of 
the identification primer (that is, whether it is a probe of a 5' tag sequence or a 
naturally occurring polymorphism in the target nucleic acid), and the signal strength of 
the identification primers, and then comparing the relative signal strengths of the 
primers to determine the relative success of each of the primer extension reactions that 
occurred (that is, the amplification and identification reactions). In this way, one 
skilled in the art can troubleshoot a primer extension reaction, or a combined 
amplification-primer extension reaction, by examining the relative abundance of the 
labeled primers and comparing the signals observed from known primers to the 
known ratios of variable nucleotides introduced by the 5' tag sequences into the 
amplicons. In this way, one skilled in the art can learn, in a single reaction run, 
whether problematic results arose due to sub-optimal amplification, sub-optimal 
extension of the variant nucleotide, or any of a host of other reaction parameters. 
This embodiment of the invention may be employed to advantage in multiplexed and 
high-throughput protocols, greatly simplifying troubleshooting of these reactions. 

Being able to define the efficiency of incorporation of each of the nucleotides 
at a polymorphic site has great utility in the field of diagnostic genotyping, where the 
certainty of the result is critical. For example, in the field of agricultural genotyping, 
it has recently been shown that the ovine PrP gene is frequently present in multiple 
copies, which complicates the analysis of this gene. Having a balanced 
heterozygote signal generated as part of the amplicon required to analyze PrP enables 
the rapid assessment of any samples which might display this phenomenon. Also, in 
pharmacogenomic analysis of large numbers of polymorphisms in a single reaction, 
having a balanced heterozygote produced as part of the amplicon enables 
confirmation of the production of that specific amplicon to assayable levels within the 
multiplex, and further provides a heterozygote polymorphism which mimics the 
specific polymorphism targeted, thereby enabling comparisons to be made and 
increasing the surety of the result called for each of the polymorphisms in the 
multiplex. In the forensic context, where mixed template samples are possible, the 
system described enables the generation of a standard curve, or linear relationship, 
describing the efficiency of incorporation of one nucleotide over the other, and enables 
the assessment of the levels of each nucleotide which must have been present in the 
original template. Here, any of a great number of polymorphisms may be utilized, and 
their characteristics assessed such that they can be combined in large multiplex 
reactions. 

Many other applications of the specialized primers and methods taught herein 
will become apparent to one of ordinary skill in the art with the teaching of this 
disclosure in hand, including paternity testing, pharmacogenomic analysis, and the 
like. 

Having now generally described the invention, the invention may be more 
readily understood through reference to the following examples, which are provided 
by way of illustration and are not intended to limit the present invention unless 
specified. 

EXAMPLES 

The current invention facilitates the analysis of two different types of 
samples: firstly, single source template DNA samples of high importance (medical 
diagnostic samples, for example), and secondly, samples which may 
contain template DNA from more than one individual, as may be encountered during 
forensic DNA analyses. 

Example 1 

When analyzing DNA from a single source, it is possible to introduce artificial 
representations of the polymorphisms under investigation on the same amplicons as 
the targeted polymorphisms, such that the artificial representations are present at 
exactly the same concentration as the targeted polymorphisms. It is further possible 
to ensure that the sequence context of the true polymorphism is mimicked in the DNA 
flanking the artificial representation of the polymorphism. All other variables which 
may influence the efficiency of DNA polymerase-mediated nucleotide addition are 
automatically normalized between the real SNP and its artificial copy during a primer 
extension reaction simultaneously interrogating both real and artificial polymorphisms. 
These variables include, for example, salt concentration, pH, thermal profile, and 
concentration of PCR components (Mg++, buffer, additives such as BSA, dNTPs, etc.). 

As an example, the SNP TSC0096009 has the following sequence: 

gttggctttc gtgtttgctg ctgtcctcat agatttcaca tggattagag gtcctccaaa 
tggagtgctg cccaccttga ccactctttc ccatgcttct tgcctgctgc ttcacatggt 
ccaggtggac tgcttttctc cccgcttaca tttcctagaa agtgccctgc tcaccctttt 
ctctggatgc tcactcaggg gttttaccag gcctgaactc tctcca 

R(a/g) 

gctgtgccgc ttcacccaac tgaggccttc tcattcttca ctttgtagtc aaggaatctg 
cagcccagaa gctcctccat tttcctccag actagcccag gtctcatacc ctttggtttc 
acctttctgt acttctttca tgttgcccag gataattcct catcattact tgtcaaatgg 
ttgtgttctc cctgggctac agattagatg aggttgggaa ttcccttttc actgcctctg 
tatctcaata gcagccccat gccaaacact tcccagggac tgagtaaaga tttccccaaa 
gggtgagtga atgttgagga aaggcagaaa gcaatcctcc ttaagtggga tatcagaatg 
ctgagcttaa cttgaaaccg tttctaaacc atagactctt atttaaagga aaccaacatg 
aaaatgccaa caccacctta tttacaaggt actttgttca ctagagctat taaagggctg 
tgttgatggg aagctgtgta taattgtagg tattatgcca gagaccgctt tctgtcaggc 
tgccagacca aaggggtagg gaccgtactc tagagaccct cacccaacag gatgattaaa 
cgaatttgta agggttaata gatgggcggt ggctcattaa aaccaactct aa (SEQ. ID NO. 1) 

The polymorphic base is an R (G or A), which can be analyzed as either a G/A, or as 
a C/T on the complementary strand. The region around this polymorphic site may be 
amplified in a standard PCR reaction using the following primers: 

Forward (upper) Primer: 5' ccctgctcacccttttctctggatgct 3' (SEQ. ID NO. 2) 

Reverse (lower) Primer: 5' gagaaggcctcagttgggtgaagcg 3' (SEQ. ID NO. 3) 

These primers have Tm values of 71.6°C and 70.1°C respectively, 
and amplify a 97 bp PCR amplicon. Using the disclosed technology, a larger PCR 
amplicon is generated, in which a 5' tag sequence unrelated to the target DNA 
modifies one of the primers such that it carries a 'mirror' of the real targeted SNP. This 
'mirror' is generated by replacing the forward primer (for example) with a 1:1 blend 
of the following two sequences: 

Forward 'T' primer: 

5' tcctcgattac*ttgtcag ccctgctcacccttttctctggatgct 3' (SEQ. ID NO. 4) 

Forward 'C' primer: 

5' tcctcgattac*ctgtcag ccctgctcacccttttctctggatgct 3' (SEQ. ID NO. 5) 

The 5' tag sequence is shown underlined, with the position of the artificially 
generated SNP shown preceded by an asterisk. A small number of additional 
hybridizing bases may be included at the junction of the 5' tag sequence and the 5' 
end of the template specific sequence in order to counter the effects of the 5' tag 
sequence causing the local disruption of hybridization here (the G in the sequences 
above, for example, is a hybridizing base absent from the forward primer without the 
5' tag sequence, SEQ. ID NO. 2). The reverse primer remains unaltered from that 
shown above (SEQ. ID NO. 3). Note that the product generated upon amplification 
using these primers and the common reverse will have two polymorphic sites 
represented: the original real SNP targeted, and a mirror of this SNP in the DNA 
derived from the 5' tag sequences. Both sites have identical flanking bases (two bases 
to the 3' on the strand to be interrogated, and one base 5' to the interrogated site). 
Both these SNPs can now be interrogated using the following two primers, which will 
incorporate G/A bases: 

Real SNP probe (forward): 5' ggttttaccaggcctgaactctctcca 3' (SEQ. ID NO. 6) 
(Tm 68.1°C) 

Mirror SNP probe (reverse): 5' agaaaagggtgagcagggctgaca 3' (SEQ. ID NO. 7) 
(Tm 67.0°C) 

Note that the terminal two bases at the 3' end of each of these primer 
sequences are CA (shown underlined), but that 5' of these two bases, there is imperfect 
homology such that under specific stringent conditions the two primers will not 
cross-hybridize. The sequences are also of different lengths, such that under analysis 
of the extension products on a fluorescent capillary electrophoresis instrument the 
results appear as two distinct but closely associated peaks (from the mirror SNP 
heterozygote) and an associated single or doublet peak which derives from the real 
SNP (see for example Figure 6). 
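
By way of illustration only, the design of the mirror SNP probe can be checked 
with a short script. The following Python sketch is not part of the disclosed 
method; the function names are hypothetical, and the character that is illegible 
in the printed 'T' primer is assumed here to be g, consistent with the 'C' primer 
and with SEQ. ID NO. 7. It locates the probe binding site on each tagged forward 
primer and reports the terminator that would be added to the probe. 

```python
# Illustrative check of the Example 1 mirror probe (hypothetical helper names).
COMPLEMENT = str.maketrans("acgt", "tgca")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a lower-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def base_added_to_reverse_probe(top_strand: str, probe: str) -> str:
    """Base incorporated by a probe that anneals to the given (top) strand.

    The probe's 3' end sits next to the template base immediately 5' of its
    binding site, so the terminator added is the complement of that base.
    """
    site = revcomp(probe)
    start = top_strand.find(site)
    if start < 1:
        raise ValueError("probe site not found, or no template base to extend against")
    return revcomp(top_strand[start - 1])


# Tagged forward primers (SEQ. ID NOs. 4 and 5) with formatting marks removed.
# Assumption: the illegible base in the printed 'T' primer is read as g.
FORWARD_T = "tcctcgattacttgtcagccctgctcacccttttctctggatgct"
FORWARD_C = "tcctcgattacctgtcagccctgctcacccttttctctggatgct"
MIRROR_PROBE = "agaaaagggtgagcagggctgaca"  # SEQ. ID NO. 7

added = {
    name: base_added_to_reverse_probe(primer, MIRROR_PROBE)
    for name, primer in (("T primer", FORWARD_T), ("C primer", FORWARD_C))
}
print(added)  # {'T primer': 'a', 'C primer': 'g'}
```

Under these assumptions the mirror probe incorporates A against the 'T' primer and 
G against the 'C' primer, matching the G/A incorporation stated for the real SNP 
probe. 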

The ratio of incorporation of each of the terminating bases is maintained 
between the mirror SNP heterozygote and the real SNP heterozygote, allowing for 
accurate genotyping of the real SNP. 
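
The use of the mirror heterozygote as an internal reference can be sketched as 
follows. This Python example is a minimal illustration under stated assumptions, 
not the inventors' software: the function names, the noise floor of 50 area units 
and the peak areas are all hypothetical. The real SNP area ratio is divided by the 
ratio observed at the 1:1 mirror SNP, so that a true heterozygote yields a 
normalized ratio close to 1 even when one terminator is incorporated more 
efficiently than the other. 

```python
def normalized_ratio(real_g, real_a, mirror_g, mirror_a):
    """Real G:A area ratio divided by the ratio seen at the 1:1 mirror SNP."""
    return (real_g / real_a) / (mirror_g / mirror_a)


def call_genotype(real_g, real_a, mirror_g, mirror_a, noise_floor=50.0):
    """Return (genotype, normalized ratio); the ratio is None for homozygotes.

    noise_floor is a hypothetical area below which a peak is treated as absent.
    """
    g_present = real_g > noise_floor
    a_present = real_a > noise_floor
    if not (g_present or a_present):
        raise ValueError("no real SNP signal above the noise floor")
    if g_present and not a_present:
        return "G/G", None
    if a_present and not g_present:
        return "A/A", None
    return "G/A", normalized_ratio(real_g, real_a, mirror_g, mirror_a)


# Hypothetical peak areas in which G gives roughly twice the signal of A.
genotype, ratio = call_genotype(real_g=4100, real_a=2000, mirror_g=6000, mirror_a=3000)
print(genotype, round(ratio, 3))  # G/A 1.025
```

In this hypothetical case the raw real SNP ratio of about 2:1 would suggest an 
imbalance, whereas the normalized ratio shows a balanced heterozygote. 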
Example 2 

In forensic applications, it may be required to analyze template DNA which 
originates from more than one individual. In these circumstances, it is beneficial to 
assume that every forensic template is a potential mixture, and to routinely analyze 
the DNA in such a manner as to enable the identification of a mixture and to 
facilitate its subsequent interpretation. Using transversion 
polymorphisms flanked by complementary bases enables these requirements to be 
met. 

Amplification of the SNP TSC0018292 will be used as an example of the 
analysis of a G/C SNP flanked by an A and a T. This will limit the sequence context 
'mirror' of the analyzed polymorphism to one base 5' and one base 3'. The sequence 
of this SNP is: 

5' flank: ctgccaagtg tagagtcgtc agggagcagg ccaggctggg ggctccctct gcccctgacc 
cctgggggag ctgctgggag agtcctggcc tctcctgcat gtgcgtggct tgctttttgg 
ctggactaag gattgcagcc atatgaaatg ctcattgctg tcctcatccc cctcccattg 
gctgtcctgg aa (SEQ. ID NO. 45) 

SNP: S(c/g) 

3' flank: tcagctcctt tctgcagggc agccactgca cacctttctt ctgtgtcctt tcaggatgtc 
ctgtgcacac acaagtatat atatatacac atatgtgtac acacacatat ataaatccta 
ggattagaat ctctggctca agggattttg tgtcctgtag atactgtgtt ttcgtttttc 
tgactttttc ctgcacactg tagactacac cgtgtgctac cctgcatttg cgattatcag 
ggaacatgtc ttggacgtcg tccacagcag cccctccaga cctgcccatt cctcctgctc 
aggcattcca tactgtgaat cacttgctta accacacctt gactgatggg gacacttact 
tcttttcact gtgtcttata atgcagccct ggatatcctt acacttattt ccttggctac 
ttgtatgagg acctttgtag gattaaattt gataactaga attgtggatc aaaaggtttg 
tgcattttca ctttgataag gatgaccaca ccctaggatg gttggctggg atccctttct 
ctaacat (SEQ. ID NO. 8) 

The S (G or C) polymorphism can be amplified using the following primer sequences: 

Forward (upper) Primer: 5' ccatatgaaatgctcattgctgtcctca 3' (SEQ. ID NO. 9) 

Reverse (lower) Primer: 5' gacacagaagaaaggtgtgcagtggctg 3' (SEQ. ID NO. 10) 


These primers have annealing temperatures of 68.3°C and 70.1°C respectively, and 
amplify a 102 bp amplicon. A significantly larger amplicon will be generated if these 
primers have 5' tag sequences added which will enable the artificial recreation of the 
targeted SNP at known concentrations. Suitable primers to enable this analysis are 
shown below: 

5' tag Forward (upper) Primers: 

5' ccaaagatcctctggagctaactcctatggtcta*gtt gccatatgaaatgctcattgctgtcctca 3' 
(SEQ. ID NO. 11) 

and 

5' ccaaagatcctctggagctaactcctatggtcta*ctt gccatatgaaatgctcattgctgtcctca 3' 
(SEQ. ID NO. 12) 

5' tag Reverse (lower) Primers: 

5' aaatcggttggattcgcttgacggaagtattgaga*gtcgt ggacacagaagaaaggtgtgcagtggctg 3' 
(SEQ. ID NO. 13) 

and 

5' aaatcggttggattcgcttgacggaagtattgaga*ctcgt ggacacagaagaaaggtgtgcagtggctg 3' 
(SEQ. ID NO. 14) 

An asterisk precedes the base in the 5' tags that results in the generation of a 
controlled variant base in the amplicon population. When the forward (upper strand) 
primers are used at a skewed ratio of, for example, 3x the G bearing primer to 1x the 
C bearing primer, this generates a mirror SNP in which there is a 3x higher 
representation of the G base than the C base. This will be reversed on the daughter 
strand such that there is a 3x higher representation of the C base over the G base. The 
reverse (lower strand) primers must be combined in the amplification reaction at 
equivalent concentrations of the G and C bearing sequences. This results in a mirror 
SNP that is a balanced heterozygote regardless of which strand is analyzed. 
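
The strand arithmetic described above can be illustrated with a few lines of 
Python. This is a sketch only: the function names are hypothetical, and it assumes 
that each primer is incorporated into amplicons in proportion to its concentration 
in the blend. 

```python
from fractions import Fraction


def allele_fractions(g_parts, c_parts):
    """Fractions of G and C at the mirror SNP for a given primer blend."""
    total = g_parts + c_parts
    return {"g": Fraction(g_parts, total), "c": Fraction(c_parts, total)}


def opposite_strand(fractions):
    """G on one strand pairs with C on the other, so the two fractions swap."""
    return {"g": fractions["c"], "c": fractions["g"]}


upper = allele_fractions(3, 1)     # 3x G-bearing : 1x C-bearing forward primer
lower = opposite_strand(upper)     # the same site read on the daughter strand
balanced = allele_fractions(1, 1)  # reverse primers at equivalent concentrations

print(upper["g"], lower["g"])                 # 3/4 1/4
print(opposite_strand(balanced) == balanced)  # True
```

The 3:1 blend reads as 3:1 on one strand and 1:3 on the other, whereas the 1:1 
blend is unchanged by the choice of strand. 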

Skewed 3:1 Mirror SNP Forward Probe (upper): 

5' (Tn)agatcctctggagctaactcctatggtcta 3' (SEQ. ID NO. 15) (Tm 65.3°C) 

Skewed 1:3 Mirror SNP Reverse Probe (lower): 

5' (Tn)acagcaatgagcatttcatatggcaa 3' (SEQ. ID NO. 16) (Tm 65.8°C) 

Balanced Het Mirror SNP Forward Probe (upper): 

5' (Tn)gcacacctttcttctgtgtccacga 3' (SEQ. ID NO. 17) (Tm 66.1°C) 

Balanced Het Mirror SNP Reverse Probe (lower): 

5' (Tn)ggattcgcttgacggaagtattgaga 3' (SEQ. ID NO. 18) (Tm 65.9°C) 

Note that each of these primers terminates in an A at the 3' end (shown 
underscored) and that each of the probes will extend to incorporate either a G or a C. 
The subsequent base in the amplicon template is then a T, and this sequence context is 
maintained regardless of strand. Further note that each probe is modified to include a 
number of non-hybridizing bases (for example, a number of Ts, here represented by 
Tn). These additional bases are included to provide a means of altering the apparent 
migration of each extended probe to occupy a unique and predictable position on 
electrophoresis. 
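
One way of choosing the number of non-hybridizing T bases is sketched below in 
Python. The scheme, the function name and the spacing of four nucleotides between 
products are assumptions made for illustration; the disclosure does not specify 
how n is selected, and actual migration also depends on the dye and on sequence 
composition. 

```python
def assign_t_tails(probes, spacing=4):
    """Return {probe name: number of 5' T bases} giving distinct total lengths.

    probes maps a name to the template-specific probe sequence. Probes are
    processed shortest first, and each is padded so that its total length is
    at least `spacing` nucleotides longer than the previous one.
    """
    tails = {}
    previous_total = None
    for name, core in sorted(probes.items(), key=lambda item: len(item[1])):
        total = len(core)
        if previous_total is not None:
            total = max(total, previous_total + spacing)
        tails[name] = total - len(core)
        previous_total = total
    return tails


# Template-specific portions of the mirror SNP probes of Example 2.
PROBES = {
    "SEQ15": "agatcctctggagctaactcctatggtcta",
    "SEQ16": "acagcaatgagcatttcatatggcaa",
    "SEQ17": "gcacacctttcttctgtgtccacga",
    "SEQ18": "ggattcgcttgacggaagtattgaga",
}

tails = assign_t_tails(PROBES)
totals = sorted(len(seq) + tails[name] for name, seq in PROBES.items())
print(tails, totals)
```

Each probe receives only as many T bases as are needed to keep the products apart 
by the chosen spacing. 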

In addition to these probes, which are designed to interrogate the artificial 
mirror SNPs, the following two probes are used to interrogate the real SNP, which is 
the target of the initial amplification: 

Forward (upper) Probe: 5' (Tn)cctcccattggctgtcctggaa 3' (SEQ. ID NO. 19) 

Reverse (lower) Probe: 5' (Tn)gctgccctgcagaaaggagctga 3' (SEQ. ID NO. 20) 


In common with the probes interrogating the artificial SNPs, these probes have 
an A at their 3' terminus, ensuring sequence-dependent effects are normalized. When 
these probes extend, they will incorporate either G or C or both G and C, dependent on 
whether the original template DNA was homozygous (for G or C) or heterozygous 
(for G and C). The subsequent base in the template is again a T. If the original 
sample was a mixture of two or more individuals, it will be possible to identify this, 
provided that the combined templates are not homozygous for either G or C, 
and that, if heterozygous, the combined templates do not represent an apparent balanced 
heterozygote, as might be generated by the combination of two (or more) individual 
templates which are opposite homozygotes in equivalent proportions, or by the 
combination of two (or more) individual templates which are heterozygotes 
(regardless of relative proportions). 

Again, each of the primer extension probes listed above is modified to include 
a number of T bases at its 5' end, in order to separate these extension products to a 
unique area of the electrophoretogram. 

The system described above will generate output similar to that shown in 
Figure 15, which shows 6 SNPs concomitantly analyzed with only four extension 
primers per SNP system. This trace, and the mathematical manipulation of the data 
contained therein, allows the relationship between absolute ratio and observed ratio of 
areas (areas under each peak) to be determined, and the observed ratio of areas from 
the real SNP to be related back to an absolute ratio. For clarity, only one of each of 
the potential 1:1 mirror SNP extensions and one of the real SNP extensions is shown 
in Figure 15. 

In order to make the association between observed ratio of areas and absolute 
ratio of bases present at a certain polymorphic site, it is necessary to define the 
mathematical relationship between the absolute ratio and the observed ratio. This can 
be done for the TSC0018292 SNP by using the previously listed 5' tag primers (SEQ 
ID NOs 11 and 12) at a much wider range of ratios of G bearing primer to C bearing 
primer, but maintaining the other primers (SEQ ID NOs 13 and 14) at a 1:1 ratio to 
provide a control from amplification tube to amplification tube. From the amplified 
DNA, only the mirror SNPs need be interrogated to build a mathematical relationship 
between the absolute ratios and the observed ratios of areas. Indeed, an artificial 
system may be developed to generate all possible SNPs with all possible local 
sequence contexts, without the need to amplify a variant DNA region, as only the 
artificially generated SNPs are required to be interrogated. 
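
The construction and use of such a relationship can be sketched in Python as 
follows. The example is illustrative only: it assumes that a straight line 
adequately describes the relationship over the range tested, the function names 
are hypothetical, and the calibration values are invented for the purpose of the 
example rather than taken from any experiment. 

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def absolute_from_observed(observed, slope, intercept):
    """Invert the fitted line to recover the absolute G:C ratio."""
    return (observed - intercept) / slope


# Hypothetical calibration: G:C primer ratios used, and the G:C area ratios
# observed at the mirror SNP (invented values, exactly 1.8x for simplicity).
absolute = [0.25, 0.5, 1.0, 2.0, 4.0]
observed = [0.45, 0.9, 1.8, 3.6, 7.2]

slope, intercept = fit_line(absolute, observed)
estimate = absolute_from_observed(2.7, slope, intercept)
print(round(slope, 3), round(estimate, 3))  # 1.8 1.5
```

With these invented values, an observed area ratio of 2.7 at the real SNP 
corresponds to an absolute G:C ratio of 1.5 in the template. 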

Example 3 

Standard Analysis of the PrP locus for Scrapie Susceptibility and the 
Observation of Complex Genotypes 

The PrP Locus 

Four SNPs of commercial interest lie within the coding region of the ovine 
PrP gene (the sequence of which is available at GENBANK accession number 
M31313, and is hereby incorporated by reference). These SNPs may be assayed by 
multiplexed chain-terminating primer extension. Since these SNPs lie in close 
proximity to one another, they can be assayed from a single PCR amplicon of 310 bp. 
This amplicon provides the target for four detection primers, each of which flanks the 
3' end of one of the four SNPs of interest. There is, however, a significant amount of 
invariant DNA also represented on the 310 bp amplicon, and this invariant DNA can 
be used as the target for control primers which extend against invariant bases, and so 
generate predictable products, irrespective of the bases present at the SNP sites. 

By selecting control and detection primer sequences, it has been possible to 
develop a single tube assay that interrogates the SNPs, and generates four labeled 
controls that flank the labeled detection primers. Two of the controls migrate under 
electrophoresis with an apparent mass smaller than all of the possible labeled 
detection primers. These controls both target the same core DNA sequence within the 
310 bp amplicon, and interrogate the same invariant base. They differ only in the 5' 
terminus, which is longer by two T bases in 50% of the primers that anneal to the 
target sequence. Two further controls migrate with a larger apparent mass than the 
detection primers. These are generated by two control primers that target another 
section of invariant sequence within the 310 bp sequence, and differ only in that one is 
two T bases longer than the other, this tag also being an addition to the 5' terminus. 
Extension of any control primer results in the incorporation of a G, which in this case 
carries a fluorescent dye that returns a blue signal under laser illumination. Flanking 
the labeled detection primer products in this way allows a Local Southern sizing 
algorithm to be applied to precisely size the labeled detection primer products. 
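
For readers unfamiliar with Local Southern sizing, the following Python sketch 
shows the method as it is commonly described: the reciprocal relationship 
L = L0 + c/(m - m0) between fragment size L and migration m is fitted exactly 
through three standards, once using two standards below and one above the unknown 
and once using one below and two above, and the two size estimates are averaged. 
The function names and the numerical standards are hypothetical, and the code is 
not taken from any instrument software. 

```python
def fit_southern(points):
    """Fit L = L0 + c / (m - m0) exactly through three (m, L) points.

    Rearranged, L*m = m0*L + L0*m + k with k = c - L0*m0, which is linear in
    m0, L0 and k; subtracting the first equation from the others removes k.
    """
    (m1, l1), (m2, l2), (m3, l3) = points
    a11, a12, b1 = l2 - l1, m2 - m1, l2 * m2 - l1 * m1
    a21, a22, b2 = l3 - l1, m3 - m1, l3 * m3 - l1 * m1
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("standards do not define a unique curve")
    m0 = (b1 * a22 - a12 * b2) / det
    l0 = (a11 * b2 - a21 * b1) / det
    c = (l1 - l0) * (m1 - m0)
    return c, m0, l0


def local_southern_size(standards, m):
    """Size an unknown at migration m from four flanking (m, L) standards."""
    if len(standards) != 4:
        raise ValueError("expected two standards on each side of the unknown")
    ordered = sorted(standards)
    estimates = []
    for trio in (ordered[:3], ordered[1:]):
        c, m0, l0 = fit_southern(trio)
        estimates.append(l0 + c / (m - m0))
    return sum(estimates) / 2


# Hypothetical standards generated from L = 5 + 6000 / (m - 10).
standards = [(m, 5 + 6000 / (m - 10)) for m in (100, 110, 150, 160)]
size = local_southern_size(standards, 130)
print(round(size, 3))  # 55.0
```

In the assay described, the four labeled controls would play the role of the four 
standards, two migrating below and two above the labeled detection primer products. 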

Template Preparation 

Ovine DNA was prepared from 20 microliters of venous blood taken into a 
NaEDTA vacutainer using a modified alkali lysis procedure, involving a 100 mM 
ammonium chloride wash and two 50 mM NaCl/0.1 mM EDTA washes, with 
centrifugation to recover the white cell pellet between washes. Lysis was achieved by 
room temperature agitation of the recovered white cell pellet in 50 mM NaOH. The 
lysed cells were then neutralized by addition of 100 mM Tris HCl, pH 7.5, and 
dilution with sterile deionized water. 

PCR amplification 

Three microliters of template DNA was combined with 3 microliters of PCR 
Mastermix, containing 200 nM forward and reverse primers, 200 micromolar dNTPs, 
2.0 mM MgCl2, 1x Gold Buffer (ABI, Warrington, UK), and 100 pg/microliter 
heat-inactivated BSA. Thirty-two cycles of PCR were performed, sufficient to generate 
approximately 5 ng of 310 bp amplicon, judging from ethidium bromide stained 
agarose gel electrophoresis. 

EXO/SAP treatment 

In order to destroy the excess amplification primers and unincorporated 
dNTPs, SAP (USB) and EXO (New England Biolabs) were added directly to each 
amplified product well. The plate was then heated at 37°C for 60 minutes, followed 
by heat inactivation of enzyme activities at 72°C for 15 minutes. 

Primer Extension 

Two and a half microliters of the EXO/SAP-treated amplicon was combined 
with 2.5 microliters of SNaPshot components (ABI, Warrington, UK), which 
contained the DNA polymerase, fluorescent dNTPs and a combination of proprietary 
probes that flank the four SNPs of interest, and also invariant positions within the 
amplicon. Twenty-five cycles of extension were performed with annealing at 57°C 
for 20 seconds, followed by 40 seconds of extension at 60°C. Heat denaturation 
between extension cycles was for 10 seconds at 95°C. 


Extension probes were designed to flank the four polymorphic bases, 
extending to incorporate a C/T at 136 in the forward direction, C/T at 154 in the 
reverse direction, A/G at 171-1 in the forward direction and C/A at 171-2 in the 
reverse direction. In addition, four control extension primers were included that were 
targeted against two invariant G incorporations, and produced predictable results 
regardless of the polymorphic bases present at 136, 154 and 171. These invariant 
products were designed to flank the diagnostic extension products upon 
electrophoretic separation, and to provide both the internal control and the size marker 
used during the analysis of the results. 

Calf Intestinal Phosphatase Digestion 

As the extension products were to be electroinjected on an ABI 3100 Capillary 
Electrophoresis instrument (ABI, Warrington, UK), the unincorporated fluorescent 
terminators were rendered neutral by digestion with calf intestinal phosphatase (CIP, 
New England Biolabs). CIP was added directly to the products of the primer 
extension reaction, and the samples were returned to incubate at 37°C for 60 minutes, 
after which the enzyme activity was destroyed by incubation at 95°C. This high 
temperature also served to completely denature the fluorescent products to prepare 
them for analysis. 

Polymorphisms 

The 310 bp amplicon generated as the target harbors the 4 SNPs that 
determine the allotypes that are produced by an individual. The possibility was 
investigated that the source of the observed imbalance lay in a novel polymorphism in 
the PrP gene underlying either the binding site for one or other of the initial 
amplification primers or the binding site(s) of the four extension primers. 

In order to sequence the entire 310 bp region, including the binding sites of the initial 
amplification primers, primers were designed that annealed further outside of the 
amplicon, and a larger amplicon was generated. 

Across a number of imbalanced samples (n = 8), sequencing of the larger 
amplicon revealed no SNPs beyond those already recorded in the literature. In addition, 
seeding of a primer extension reaction with the larger amplicon generated imbalanced 
profiles that were indistinguishable from those generated by the shorter 310 bp 
amplicon. Thus, evidence from two sources indicated that the imbalance is not rooted 
in allele-specific amplification of one of two alleles present. In retrospect, this was 
unlikely, as there were observed incidences where the imbalance indicated an excess 
of one allele when considering the 154 result, but an excess of the other allele when 
assessing the 171 result, a situation that is clearly inconsistent with allele-specific 
amplification. This simultaneous fluctuation of intensity of more than one SNP at a 
time also argues against the cause being a new SNP underlying an extension primer 
binding site, as this would require a plurality of novel SNPs. 

To date, approximately 500 instances of animals that demonstrate unacceptably 
imbalanced profiles have been recorded by the inventors. This figure is approximately 
0.1% of the tested animals. On reviewing the profiles together, the imbalances were 
observed at all 4 SNPs, and (when the genotype was called from absolute presence or 
absence of a peak only) for all genotypes. When examining multiple repeats, and 
resampling of animals, the results were consistent. It was determined that template 
DNA containing at least three regions encoding the PrP sequence was targeted by the 
initial amplification primers. 



Pedigrees from animals tested using the SNP-IT assay were constructed, 
anticipating that a gene duplication would be observed as inheritance of an imbalanced 
profile in a proportion of offspring, or that a trisomy may display the inheritance of an 
allotype not thought present if the genotype is derived merely by absolute presence 
and absence of peaks (given that the trisomy does not affect the normal fertility of the 
parent). One interesting pedigree obtained demonstrates the inheritance of a silent 
third allele. 

Determination of an individual animal's scrapie susceptibility is inferred from 
the combination of PrP allotypes. An animal determined to have both ARR and VRQ 
allotypes may be used in controlled breeding programs in anticipation that the 
beneficial allele will be passed with equivalent frequency, with half of any progeny 
inheriting the desirable ARR allotype. 

Depending on the clarification of the type of gene duplication, the results 
shown in this example may lead to certain animals which were previously thought to 
be of limited breeding potential being recognized, in light of the results described 
herein, as being of greater value, and to others of tolerable genotype being reclassified 
as of unacceptable genotype. For example, an animal classed as a triploid ARR/ARR/VRQ 
will, if the characteristics are heritable as separate elements, generate more 
ARR-bearing progeny than VRQ-bearing progeny. Conversely, animals determined to 
be ARR/VRQ/VRQ will pass the undesirable VRQ allele with greater frequency. 
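
The expected proportions can be illustrated with a short Python sketch. It rests 
on a deliberately simple assumption that is not established by the results above: 
that each copy is a separately heritable element and that exactly one copy, chosen 
with equal probability, is passed to each offspring. The function name is 
hypothetical. 

```python
from collections import Counter
from fractions import Fraction


def transmission_fractions(copies):
    """Fraction of progeny expected to inherit each allotype.

    Assumes one copy, chosen uniformly at random, is passed to each offspring.
    """
    counts = Counter(copies)
    return {allotype: Fraction(n, len(copies)) for allotype, n in counts.items()}


print(transmission_fractions(["ARR", "VRQ"]))         # 1/2 each
print(transmission_fractions(["ARR", "ARR", "VRQ"]))  # ARR 2/3, VRQ 1/3
print(transmission_fractions(["ARR", "VRQ", "VRQ"]))  # ARR 1/3, VRQ 2/3
```

Under this assumption an ARR/ARR/VRQ animal passes ARR to two thirds of its 
progeny, compared with one half for an ordinary ARR/VRQ heterozygote. 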

Example 4 

Having observed that imbalance is detectable at the PrP locus, Applicants 
designed a novel assay incorporating the inventions disclosed here, such that any such 
imbalance will be more certainly detected in routine laboratory operations. 

Illustrated below is an example of how the present disclosure can be used in 
the analysis of four SNP sites within a portion of the PrP gene from sheep (Ovis 
aries). This example has aspects of both the amplification of a single source template, 
and the interpretation of a mixed template, as the ovine PrP gene may be present in 
greater than two copies per cell in some animals, resulting in imbalanced (and 
apparently 'mixed') profiles being generated. This description may be better 
understood with reference to Figures 18, 25 and 26. 

The initial amplification was undertaken using the following primer sequences: 

5' tag Forward (upper) Primers: 

tgaggatccactggatagctgaagctctggaca*cgatg*acatcg tcaaggtggtagccacagtcagtggaacaag 
(SEQ. ID NO. 21) 

and 

tgaggatccactggatagctgaagctctggaca*tgatg*gcatcg tcaaggtggtagccacagtcagtggaacaag 
(SEQ. ID NO. 22) 

5' tag Reverse (lower) Primers: 

atgcacgcatagctgatacggtcacctgta*acatc*cgaag gtggtggtggtgactgtgtgttgcttgac 
(SEQ. ID NO. 23) 

and 

atgcacgcatagctgatacggtcacctgta*ccatc*tgaag gtggtggtggtgactgtgtgttgcttgac 
(SEQ. ID NO. 24) 


As before, nucleotides directing the generation of a variant nucleotide in the 
amplicon population are preceded by an asterisk. Note that there are two such 
positions in the 5' tags of each of the primers. Use of these primers in equivalent 
concentrations in the initial amplification generated an amplicon of 390 bp, with the 
four targeted SNPs recreated in the terminal ends of the amplicons such that they can 
be interrogated to return a balanced heterozygote signal for each. These balanced 
heterozygotes were generated by adding the following sequences to the primer 
extension reaction: 


Mirror 136 Probe: 5' (Tn)tgactgtggctaccaccttgacgatg 3' (SEQ. ID NO. 25) 

Mirror 154 Probe: 5' (Tn)tccactggatagctgaagctctggaca 3' (SEQ. ID NO. 26) 

Mirror 171-1 Probe: 5' (Tn)aacacacagtcaccaccaccaccttc 3' (SEQ. ID NO. 27) 

Mirror 171-2 Probe: 5' (Tn)cacgcatagctgatacggtcacctgta 3' (SEQ. ID NO. 28) 

These probes are modified at their 5' end to include non-hybridizing bases, 
which are represented by (Tn). These sequences do not contribute to the binding of 
the probe to the target sequence within the amplicon, but merely modify the position 
to which the extended probes migrate under electrophoresis, ensuring that the real 
SNP probes and the mirror SNP probes will migrate with distinct properties. An 
image of the hypothetical output from this system is presented as Figure 19, where the 
real SNPs are all shown as the heterozygote form, whereas in reality this is unlikely to 
ever occur for this particular ovine system. However, the mirror SNPs are shown to 
return a balanced heterozygote signal for each SNP. The ratio between the area of 
each peak in the heterozygote mirror serves as a confirmation of the heterozygosity of 
the corresponding real SNP. An actual image of the electropherogram generated on 
execution of this experiment is presented as Figures 25 and 26, with Figure 26 
demonstrating incomplete separation of the mirror SNPs. Complete separation can be 
achieved by modifying the 5' tags of the mirror SNP probes. 

It might be expected that an imbalanced profile would deviate significantly 
from the normal heterozygote area ratios defined by the mirror SNPs, and such an 
imbalanced profile may be indicative of additional genetic material being present in 
the template used to seed the amplification reaction. It is possible to assess the area 
ratios at the mirror SNPs and use these ratios to automatically assess the balance 
observed at the appropriate real SNPs, and pass or fail a profile as being normal or 
imbalanced. This functionality is additional to the ability to automatically genotype 
SNPs using this primer extension technology. 
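
An automated pass or fail assessment of this kind might be sketched in Python as 
follows. The function names, the tolerance of 0.5 log2 units and the peak areas 
are hypothetical values chosen for illustration, and the approach shown (comparing 
the log2 of the real SNP ratio against that of the corresponding mirror SNP) is 
one possible reading of the text rather than a prescribed algorithm. 

```python
import math


def assess_profile(real, mirror, max_log2_deviation=0.5):
    """Classify each SNP as 'homozygous', 'balanced' or 'imbalanced'.

    real and mirror map a SNP name to a pair of peak areas (allele 1, allele 2).
    A real heterozygote is 'balanced' when its area ratio lies within the
    tolerance of the ratio returned by the corresponding 1:1 mirror SNP.
    """
    results = {}
    for snp, (mirror_1, mirror_2) in mirror.items():
        real_1, real_2 = real[snp]
        if real_1 == 0 or real_2 == 0:
            results[snp] = "homozygous"
            continue
        deviation = math.log2((real_1 / real_2) / (mirror_1 / mirror_2))
        within = abs(deviation) <= max_log2_deviation
        results[snp] = "balanced" if within else "imbalanced"
    return results


def profile_passes(results):
    """A profile passes when no SNP is classed as imbalanced."""
    return "imbalanced" not in results.values()


# Hypothetical peak areas for three of the four PrP SNPs.
mirror = {"136": (1000, 1000), "154": (1000, 1000), "171-1": (1000, 950)}
real = {"136": (1000, 0), "154": (900, 1000), "171-1": (2000, 900)}

results = assess_profile(real, mirror)
print(results, profile_passes(results))
```

In this invented profile the 171-1 result deviates from its mirror by about one 
log2 unit, so the profile is flagged for review while the other sites are called 
normally. 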

Example 5 

A SWaP SNP was amplified and it was successfully demonstrated that the 
introduced ratio of 1:1 is maintained on both strands, at a heterozygote real SNP. 

TSC84838 has the DNA sequence: 

5' flank: taatagaaac tacaggctaa ttacctgaat tatatatttt tttttccatg atgtcctcca 
agttccaggt aagtatgttt atttgtgatt gtcattttca tgtggatgcc tatgtttctg 
ggagatctat gcccttctcc aagttctggt gaagaaggtt tggagacagc cactacccaa 
aatgtatttg tcttcattct tcaccttgct aaatcttacg cattttaagg tcccagcttt 
ctattcctcc attcaaaaaa cataatttga ttatttgctt ctattccaat attctttgta 
tttcccacac agggtatatt accttaggtg tcctaagaga tttttgcctt tcaatgtacg 
caagcccagc acatgccatg gtacatagta gagattttct ctctctcgct gtctctctct 
ct (SEQ. ID NO. 46) 

Observed: S(c/g) 

3' flank: agagcgaata tacacataca ttttgaacaa gtttatatat ttttactcca aactcagttc 
tgatgcccag atggagaaaa aaataggaga aaaatatttt tccatgtaag aaaaagtata 
ccagtgaagc aaaatgcatt gtgcttctta tctaattatt gctacataaa aacacaaaat 
tttatctcta cgatttcaga actatttact tgataaccta gtagtaaaag gaattttgta 
tgttctcatt ttgcacattg tctcattcag actttccttt tatatgtatt atcttcagtg 
ttaacatatt gtaaaatgtt tcattgtcac tcatattctg attttaagac agaagtacat 
tttaagcatc aatttcaact taaacaaaat tgccttctca caaaattggc tgttattctg 
ataaccaaaa gggctaagtg gaagagacat ataattactt attctaaaat tgtagaattt 
ggcaagaagt gagacttatt tgattcattt ataaaacatg taaacaaaag acagttatcc 
tctgcctgaa ttaaaataga tgagtttttc ataaaaataa ataagtgact gttctcatgc 
(SEQ. ID NO. 29) 

The G/C SNP can be amplified with the following primer pairs, introducing a copy of 
the polymorphic base at both ends of the amplicon: 

G-bearing forward: 

ttccaatctttacggtatgtcgcccatcttgct*gagtagtgagccatggtacatagtagagattttctctctctcgct (SEQ. 
ID NO. 30) 

C-bearing forward: 

ttccaatctttacggtatgtcgcccatcttgct*cagtagtgagccatggtacatagtagagattttctctctctcgct (SEQ. 
ID NO. 31) 

G-bearing reverse: 

gaga*gtcctccatctgggcatcagaactgagtttggagta (SEQ. ID NO. 32) 

C-bearing reverse: 

gaga*ctcctccatctgggcatcagaactgagtttggagta (SEQ. ID NO. 33) 

The palindromic sequence before and after the introduced mirror SNP is 
identical to that which flanks the real SNP, and is shown underscored. The introduced 
mirror SNP is in each case shown preceded by an asterisk. Note that the sequence of 
the palindrome on the reverse primers is not a true mirror. These reverse primers were 
used to determine whether this had an observable effect on the efficiency of 
incorporation of the G and C terminators at this pseudo-mirror SNP. 
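The primer and flanking sequences listed above can be combined to reconstruct the expected amplification product in silico. The following is a minimal sketch, assuming that each primer anneals through its 3'-most 20 bases (the value of `k` is an assumption made for illustration), that only the G-bearing primer pair is used, and that "g" stands in for the real SNP base. Only the terminal portions of the two flanks are included; the function names are hypothetical.

```python
def revcomp(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]

# Last 62 bases of the 5' flank and first 90 bases of the 3' flank,
# copied from the sequences given in this example.
FIVE_FLANK_END = ("caagcccagcacatgccatggtacatagtagagattttctctctctcgct"
                  "gtctctctctct")
THREE_FLANK_START = ("agagcgaatatacacatacattttgaacaagtttatatatttttactcca"
                     "aactcagttctgatgcccagatggagaaaaaaataggaga")

# G-bearing primers with the asterisk marker removed.
FWD_G = ("ttccaatctttacggtatgtcgcccatcttgct" "g"
         "agtagtgagccatggtacatagtagagattttctctctctcgct")
REV_G = "gaga" "g" "tcctccatctgggcatcagaactgagtttggagta"

def build_amplicon(fwd, rev, five_flank, three_flank, snp_base, k=20):
    """Assemble the product from primers plus the template between them.

    The 3' end of each primer (k bases, an assumed anchor length) is
    located on the template; tailed 5' portions of the primers are
    carried into the product unchanged.
    """
    fwd_site = five_flank.rfind(fwd[-k:])
    rev_site = three_flank.find(revcomp(rev[-k:]))
    if fwd_site < 0 or rev_site < 0:
        raise ValueError("primer 3' end not found on template")
    between = five_flank[fwd_site + k:] + snp_base + three_flank[:rev_site]
    return fwd + between + revcomp(rev)

product = build_amplicon(FWD_G, REV_G, FIVE_FLANK_END,
                         THREE_FLANK_START, snp_base="g")
print(len(product))
```

Under these assumptions the reconstructed product is 174 bases long: 78 bases of forward primer, 12 bases of 5' flank, the real SNP, 43 bases of 3' flank, and 40 bases contributed by the reverse primer.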
This amplicon is 174 bp in length and can be probed using the following 
probe sequences: 

Mirror 1 reverse: 

agagaaaatctctactatgtaccatggctcactact (SEQ. ID NO. 34) 

Mirror 1 forward: 

tctttacggtatgtcgcccatcttgct (SEQ. ID NO. 35) 

Real reverse: 

ataaacttgttcaaaatgtatgtgtatattcgctct (SEQ. ID NO. 36) 

Pseudo-Mirror 2 forward: 

tcagttctgatgcccagatggagga (SEQ. ID NO. 37) 

Note that all of these probes terminate in the same two bases (CT), except the 
Pseudo-Mirror 2 forward probe, which terminates in GA and does not interrogate a 
true mirror SNP. Nevertheless, the results of all these extensions favor the 
incorporation of the G terminator over the C terminator, a clear indication that 
peak balance in the electropherograms is governed by efficiency of incorporation, 
not by asymmetry of the template DNA used in the assay. Note that even the 
inadvertent pseudo-mirror SNP 2 has incorporated the G terminator with greater 
efficiency than the C terminator (Figure 38). 
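The observation about the probe 3' termini can be checked directly from the probe sequences. The short sketch below assumes nothing beyond the four sequences listed in this example; the dictionary and function names are hypothetical.

```python
# Probe sequences as listed in this example (5' to 3').
PROBES = {
    "Mirror 1 reverse": "agagaaaatctctactatgtaccatggctcactact",
    "Mirror 1 forward": "tctttacggtatgtcgcccatcttgct",
    "Real reverse": "ataaacttgttcaaaatgtatgtgtatattcgctct",
    "Pseudo-Mirror 2 forward": "tcagttctgatgcccagatggagga",
}

def three_prime_dinucleotide(probe):
    """Return the last two bases of a probe, i.e. those adjacent to the
    position at which the terminator is incorporated."""
    return probe[-2:].upper()

termini = {name: three_prime_dinucleotide(seq)
           for name, seq in PROBES.items()}

for name, end in termini.items():
    print(f"{name}: {end}")
```

Three of the four probes end in CT and only the Pseudo-Mirror 2 forward probe ends in GA, which is the sequence-context difference discussed above.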

This result implies that it may not be necessary in all cases that the S or W 
SNP lies amid a palindromic sequence. However, Applicants anticipate that this will 
vary from SNP to SNP and flanking sequence to flanking sequence. 

While the invention has been described in connection with specific 
embodiments thereof, it will be understood that it is capable of further modifications 
and this application is intended to cover any variations, uses, or adaptations of the 
invention following, in general, the principles of the invention and including such 
departures from the present disclosure as come within known or customary practice 
within the art to which the invention pertains and as may be applied to the essential 
features hereinbefore set forth and as follows in the scope of the appended claims. 